Introduction and Dedication


This book is dedicated to Paul Erdős, the greatest mathematician I
have ever known, whom it has been my rare privilege to consider 
colleague, collaborator, and dear friend. 

I like to think that Erdős, whose mathematics embodied the principles
which have impressed themselves upon me as defining the true
character of mathematics, would have appreciated this little book 
and heartily endorsed its philosophy. This book proffers the thesis 
that mathematics is actually an easy subject and many of the famous 
problems, even those in number theory itself, which have famously 
difficult solutions, can be resolved in simple and more direct terms. 

There is no doubt a certain presumptuousness in this claim. The 
great mathematicians of yesteryear, those working in number the- 
ory and related fields, did not necessarily strive to effect the simple 
solution. They may have felt that the status and importance of mathe- 
matics as an intellectual discipline entailed, perhaps indeed required, 
a weighty solution. Gauss was certainly a wordy master and Euler 
another. They belonged to a tradition that undoubtedly revered math- 
ematics, but as a discipline at some considerable remove from the 
commonplace. In keeping with a more democratic concept of intelli- 
gence itself, contemporary mathematics diverges from this somewhat 
elitist view. The simple approach implies a mathematics generally 
available even to those who have not been favored with the natural 
endowments, nor the careful cultivation of an Euler or Gauss. 




Such an attitude might prove an effective antidote to a generally 
declining interest in pure mathematics. But it is not so much as incen- 
tive that we proffer what might best be called “the fun and games” 
approach to mathematics, but as a revelation of its true nature. The 
insistence on simplicity asserts a mathematics that is both “magi- 
cal” and coherent. The solution that strives to master these qualities 
restores to mathematics that element of adventure that has always 
supplied its peculiar excitement. That adventure is intrinsic to even 
the most elementary description of analytic number theory. 

The initial step in the investigation of a number theoretic item 
is the formulation of “the generating function”. This formulation 
inevitably moves us away from the designated subject to a consider- 
ation of complex variables. Having wandered away from our subject, 
it becomes necessary to effect a return. Toward this end “The Cauchy 
Integral” proves to be an indispensable tool. Yet it leads us, inevitably,
further afield, into all the intricacies of contour integration, and these,
in turn, entail the familiar processes: the deformation and estimation
of contour integrals.

Retracing our steps we find that we have gone from number theory 
to function theory, and back again. The journey seems circuitous, yet 
in its wake a pattern is revealed that implies a mathematics deeply 
inter-connected and cohesive. 


I 


The Idea of Analytic Number 
Theory 


The most intriguing thing about Analytic Number Theory (the use of 
Analysis, or function theory, in number theory) is its very existence! 
How could one use properties of continuous valued functions to
determine properties of those most discrete items, the integers? Analytic
functions? What has differentiability got to do with counting? The
astonishment mounts further when we learn that the complex zeros 
of a certain analytic function are the basic tools in the investigation 
of the primes. 

The answer to all this bewilderment is given by the two words 
generating functions. Well, there are answers and answers. To those 
of us who have witnessed the use of generating functions this is a kind 
of answer, but to those of us who haven’t, this is simply a restatement 
of the question. Perhaps the best way to understand the use of the 
analytic method, or the use of generating functions, is to see it in 
action in a number of pertinent examples. So let us take a look at 
some of these. 


Addition Problems 


Questions about addition lend themselves very naturally to the use of 
generating functions. The link is the simple observation that adding 
$m$ and $n$ is isomorphic to multiplying $z^m$ and $z^n$. Thereby questions
about the addition of integers are transformed into questions about
the multiplication of polynomials or power series. For example,
Lagrange’s beautiful theorem that every positive integer is the sum of
four squares becomes the statement that all of the coefficients of the
power series for $(1 + z + z^4 + \cdots + z^{n^2} + \cdots)^4$ are positive. How
one proves such a fact about the coefficients of such a power series 
is another story, but at least one begins to see how this transition 
from integers to analytic functions takes place. But now let’s look at 
some addition problems that we can solve completely by the analytic 
method. 


Change Making 


How many ways can one make change of a dollar? The answer is 
293, but the problem is both too hard and too easy. Too hard because 
the available coins are so many and so diverse. Too easy because it 
concerns just one “changee,” a dollar. More fitting to our spirit is the 
following problem: How many ways can we make change for n if the 
coins are 1, 2, and 3? To form the appropriate generating function, 
let us write, for |z| < 1, 


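$$\frac{1}{1-z} = 1 + z + z^{1+1} + z^{1+1+1} + \cdots,$$
$$\frac{1}{1-z^2} = 1 + z^2 + z^{2+2} + z^{2+2+2} + \cdots,$$
$$\frac{1}{1-z^3} = 1 + z^3 + z^{3+3} + z^{3+3+3} + \cdots,$$

and multiplying these three equations to get

$$\frac{1}{(1-z)(1-z^2)(1-z^3)} = (1 + z + z^{1+1} + \cdots)(1 + z^2 + z^{2+2} + \cdots)(1 + z^3 + z^{3+3} + \cdots).$$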


Now we ask ourselves: What happens when we multiply out the
right-hand side? We obtain terms like $z^{1+1+1+1} \cdot z^2 \cdot z^{3+3}$. On the one
hand, this term is $z^{12}$, but, on the other hand, it is $z^{\text{four 1's} + \text{one 2} + \text{two 3's}}$
and doesn’t this exactly correspond to the method of changing the
amount 12 into four 1’s, one 2, and two 3’s? Yes, and in fact we
see that “every” way of making change (into 1’s, 2’s, and 3’s) for
“every” $n$ will appear in this multiplying out. Thus if we call $C(n)$ the
number of ways of making change for $n$, then $C(n)$ will be the exact
coefficient of $z^n$ when the multiplication is effected. (Furthermore
all is rigorous and not just formal, since we have restricted ourselves
to $|z| < 1$ wherein convergence is absolute.)

Thus

$$\sum C(n) z^n = \frac{1}{(1-z)(1-z^2)(1-z^3)}, \tag{1}$$

and the generating function for our unknown quantity $C(n)$ is
produced. Our number theoretic problem has been translated into
a problem about analytic functions, namely, finding the Taylor
coefficients of the function $\frac{1}{(1-z)(1-z^2)(1-z^3)}$.

Fine. A well defined analytic problem, but how to solve it? We must
resist the temptation to solve this problem by undoing the analysis
which led to its formulation. Thus the thing not to do is expand $\frac{1}{1-z}$,
$\frac{1}{1-z^2}$, $\frac{1}{1-z^3}$ respectively into $\sum z^a$, $\sum z^{2b}$, $\sum z^{3c}$ and multiply only to
discover that the coefficient is the number of ways of making change
for $n$.

The correct answer, in this case, comes from an algebraic technique
that we all learned in calculus, namely partial fractions. Recall
that this leads to terms like $\frac{A}{(1-\alpha z)^k}$ for which we know the expansion
explicitly (namely, $\frac{1}{(1-\alpha z)^k}$ is just a constant times the $(k-1)$th
derivative of $\frac{1}{1-\alpha z} = \sum \alpha^n z^n$).

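Carrying out the algebra, then, leads to the partial fractional
decomposition which we may arrange in the following form:

$$\frac{1}{(1-z)(1-z^2)(1-z^3)} = \frac{1}{6}\,\frac{1}{(1-z)^3} + \frac{1}{4}\,\frac{1}{(1-z)^2} + \frac{1}{4}\,\frac{1}{1-z^2} + \frac{1}{3}\,\frac{1}{1-z^3}.$$

Thus, since

$$\frac{1}{(1-z)^2} = \frac{d}{dz}\,\frac{1}{1-z} = \frac{d}{dz} \sum z^{n+1} = \sum (n+1) z^n$$

and

$$\frac{1}{(1-z)^3} = \frac{d}{dz}\,\frac{1}{2(1-z)^2} = \frac{d}{dz} \sum \frac{n+2}{2}\, z^{n+1} = \sum \frac{(n+2)(n+1)}{2}\, z^n,$$

$$C(n) = \frac{(n+2)(n+1)}{12} + \frac{n+1}{4} + \frac{\chi_1(n)}{4} + \frac{\chi_2(n)}{3}, \tag{2}$$

where $\chi_1(n) = 1$ if $2 \mid n$ and $= 0$ otherwise; $\chi_2(n) = 1$ if $3 \mid n$
and $= 0$ else. A somewhat cumbersome formula, but one which can
be shortened nicely into

$$C(n) = \left[\frac{n^2}{12} + \frac{n}{2} + 1\right], \tag{3}$$

where the terms in the brackets mean the greatest integers.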
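
The exact formula can be checked by machine. The following Python sketch is ours (the names count_change and closed_form are illustrative, not from the text): it multiplies out the generating function one coin at a time, reads off the coefficient of z^n, and compares it with the greatest-integer formula (3).

```python
def count_change(n, coins):
    """Coefficient of z^n in the product of 1/(1 - z^c) over the coins c."""
    ways = [1] + [0] * n              # power series of the constant 1
    for coin in coins:                # multiply by 1/(1 - z^coin), one coin at a time
        for m in range(coin, n + 1):
            ways[m] += ways[m - coin]
    return ways[n]


def closed_form(n):
    """Formula (3): the greatest integer in n^2/12 + n/2 + 1."""
    return (n * n + 6 * n + 12) // 12


if __name__ == "__main__":
    for n in range(13):
        print(n, count_change(n, (1, 2, 3)), closed_form(n))
```

The same routine handles the dollar: with coins 1, 5, 10, 25, 50 it gives 292 ways, and 293 once the dollar coin itself is allowed.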

A nice crisp exact formula, but these are rare. Imagine the mess 
that occurs if the coins were the usual coins of the realm, namely 1, 5, 
10, 25, 50, (100?). The right thing to ask for then is an “asymptotic” 
formula rather than an exact one. 

Recall that an asymptotic formula $F(n)$ for a function $f(n)$ is one
for which $\lim_{n\to\infty} \frac{f(n)}{F(n)} = 1$. In the colorful language of E. Landau,
the relative error in replacing $f(n)$ by $F(n)$ is eventually 0%. At
any rate, we write $f(n) \sim F(n)$ when this occurs. One famous such
example is Stirling’s formula $n! \sim \sqrt{2\pi n}\,(n/e)^n$. (Also note that our
result (3) can be weakened to $C(n) \sim \frac{n^2}{12}$.)

So let us assume quite generally that there are coins $a_1, a_2, a_3, \ldots,$
$a_k$, where to avoid trivial congruence considerations we will require
that there be no common divisors other than 1. In this generality we
ask for an asymptotic formula for the corresponding $C(n)$. As before
we find that the generating function is given by

$$\sum C(n) z^n = \frac{1}{(1-z^{a_1})(1-z^{a_2}) \cdots (1-z^{a_k})}. \tag{4}$$


But the next step, explicitly finding the partial fractional decomposition
of this function, is a hopeless task. However, let us simply
look for one of the terms in this expansion, the heaviest one. Thus
at $z = 1$ the denominator has a $k$-fold zero and so there will be a
term $\frac{c}{(1-z)^k}$. All the other zeros are roots of unity and, because we
assumed no common divisors, all will be of order lower than $k$.

Thus, although the coefficient of the term $\frac{c}{(1-z)^k}$ is $c\binom{n+k-1}{k-1}$, the
coefficients of all other terms $\frac{a}{(1-\omega z)^j}$ will be $a\,\omega^n \binom{n+j-1}{j-1}$. Since all of
these $j$ are less than $k$, the sum total of all of these terms is negligible
compared to our heavy term $c\binom{n+k-1}{k-1}$. In short $C(n) \sim c\binom{n+k-1}{k-1}$, or
even simpler,

$$C(n) \sim c\,\frac{n^{k-1}}{(k-1)!}.$$

But, what is $c$? Although we have deftly avoided the necessity of
finding all of the other terms, we cannot avoid this one (it’s the whole
story!). So let us write

$$\frac{1}{(1-z^{a_1})(1-z^{a_2}) \cdots (1-z^{a_k})} = \frac{c}{(1-z)^k} + \text{other terms},$$

multiply by $(1-z)^k$ to get

$$\frac{1-z}{1-z^{a_1}} \cdot \frac{1-z}{1-z^{a_2}} \cdots \frac{1-z}{1-z^{a_k}} = c + (1-z)^k \times \text{other terms},$$

and finally let $z \to 1$. By L’Hôpital’s rule, for example, $\frac{1-z}{1-z^{a_i}} \to \frac{1}{a_i}$
whereas each of the other terms times $(1-z)^k$ goes to 0. The final
result is $c = \frac{1}{a_1 a_2 \cdots a_k}$, and our final asymptotic formula reads

$$C(n) \sim \frac{n^{k-1}}{a_1 a_2 \cdots a_k\,(k-1)!}. \tag{5}$$
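
A numerical experiment makes the asymptotic formula (5) tangible. In the sketch below (our own illustration; heavy_term and ratio are hypothetical helper names) the exact count is divided by the heavy term, and the quotient should drift toward 1 as n grows. The coin sets used are just sample choices with no common divisor.

```python
from math import factorial, prod


def count_change(n, coins):
    """Exact number of ways to change n: coefficient of z^n in (4)."""
    ways = [1] + [0] * n
    for coin in coins:                # multiply by 1/(1 - z^coin)
        for m in range(coin, n + 1):
            ways[m] += ways[m - coin]
    return ways[n]


def heavy_term(n, coins):
    """Right-hand side of (5): n^(k-1) / (a_1 a_2 ... a_k (k-1)!)."""
    k = len(coins)
    return n ** (k - 1) / (prod(coins) * factorial(k - 1))


def ratio(n, coins):
    """Exact count divided by the asymptotic estimate."""
    return count_change(n, coins) / heavy_term(n, coins)


if __name__ == "__main__":
    for n in (100, 1000, 10000, 50000):
        print(n, ratio(n, (1, 2, 3)), ratio(n, (1, 5, 10, 25)))
```

The approach to 1 is slow for the coins 1, 5, 10, 25, which is a reminder that (5) captures only the heaviest term of the partial fraction expansion.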


Crazy Dice 


An ordinary pair of dice consist of two cubes each numbered 1 
through 6. When tossed together there are altogether 36 (equally 
likely) outcomes. Thus the sums go from 2 to 12 with varied 
numbers of repeats for these possibilities. In terms of our analytic
representation, each die is associated with the polynomial
$z + z^2 + z^3 + z^4 + z^5 + z^6$. The combined possibilities for the
sums then are the terms of the product

$$(z + z^2 + z^3 + z^4 + z^5 + z^6)(z + z^2 + z^3 + z^4 + z^5 + z^6)$$
$$= z^2 + 2z^3 + 3z^4 + 4z^5 + 5z^6 + 6z^7 + 5z^8 + 4z^9 + 3z^{10} + 2z^{11} + z^{12}.$$

The correspondence, for example, says that there are 3 ways for the
10 to show up, the coefficient of $z^{10}$ being 3, etc. The question is: Is
there any other way to number these two cubes with positive integers 
so as to achieve the very same alternatives? 

Analytically, then, the question amounts to the existence of
positive integers, $a_1, \ldots, a_6$; $b_1, \ldots, b_6$, so that

$$(z^{a_1} + \cdots + z^{a_6})(z^{b_1} + \cdots + z^{b_6}) = z^2 + 2z^3 + 3z^4 + \cdots + 3z^{10} + 2z^{11} + z^{12}.$$


These would be the “Crazy Dice” referred to in the title of this sec- 
tion. They look totally different from ordinary dice but they produce 
exactly the same results! 

So, repeating the question, can

$$(z^{a_1} + \cdots + z^{a_6})(z^{b_1} + \cdots + z^{b_6}) = (z + z^2 + z^3 + z^4 + z^5 + z^6)(z + z^2 + z^3 + z^4 + z^5 + z^6)? \tag{6}$$


To analyze this possibility, let us factor completely (over the rationals)
this right-hand side. Thus $z + z^2 + z^3 + z^4 + z^5 + z^6 = z\,\frac{1-z^6}{1-z} =$
$z(1+z+z^2)(1+z^3) = z(1+z+z^2)(1+z)(1-z+z^2)$. We conclude
from (6) that the “a-polynomial” and “b-polynomial” must consist of
these factors. Also there are certain side restrictions. The a’s and b’s
are to be positive and so a z-factor must appear in both polynomials.
The a-polynomial must be 6 at $z = 1$ and so the $(1+z+z^2)(1+z)$
factor must appear in it, and similarly in the b-polynomial. All that
is left to distribute are the two factors of $1 - z + z^2$. If one apiece are
given to the a- and b-polynomials, then we get ordinary dice. The
only thing left to try is putting both into the a-polynomial.


This works! We obtain finally

$$\sum z^{a} = z(1+z+z^2)(1+z)(1-z+z^2)^2 = z + z^3 + z^4 + z^5 + z^6 + z^8$$

and

$$\sum z^{b} = z(1+z+z^2)(1+z) = z + 2z^2 + 2z^3 + z^4.$$

Translating back, the crazy dice are 1,3,4,5,6,8 and 1,2,2,3,3,4.
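
A skeptical reader can confirm the claim by brute force. The short Python sketch below (ours; sum_distribution is an illustrative name) tabulates the 36 outcomes for each pair of dice and compares the two tables of sums.

```python
from collections import Counter
from itertools import product


def sum_distribution(die_a, die_b):
    """How many of the 36 outcomes produce each possible sum."""
    return Counter(a + b for a, b in product(die_a, die_b))


ordinary = (1, 2, 3, 4, 5, 6)
crazy_a = (1, 3, 4, 5, 6, 8)
crazy_b = (1, 2, 2, 3, 3, 4)

if __name__ == "__main__":
    usual = sum_distribution(ordinary, ordinary)
    crazy = sum_distribution(crazy_a, crazy_b)
    for total in sorted(usual):
        print(total, usual[total], crazy[total])
```

The two columns agree for every sum from 2 to 12, which is exactly the polynomial identity (6) read coefficient by coefficient.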

Now we introduce the notion of the representation function. So,
suppose there is a set A of nonnegative integers and that we wish to 
express the number of ways in which a given integer n can be written 
as the sum of two of them. The trouble is that we must decide on 


conventions. Does order count? Can the two summands be equal? 
Therefore we introduce three representation functions. 


$$r(n) = \#\{(a, a') : a, a' \in A,\ n = a + a'\},$$
so here order counts, and they can be equal;
$$r_+(n) = \#\{(a, a') : a, a' \in A,\ a \le a',\ n = a + a'\},$$
order doesn’t count, and they can be equal;
$$r_-(n) = \#\{(a, a') : a, a' \in A,\ a < a',\ n = a + a'\},$$
order doesn’t count, and they can’t be equal. In terms of the generating
function for the set $A$, namely, $A(z) = \sum_{a \in A} z^a$, we can express
the generating functions of these representation functions.

The simplest is that of $r(n)$, where obviously

$$\sum r(n) z^n = A^2(z). \tag{7}$$

To deal with $r_-(n)$, we must subtract $A(z^2)$ from $A^2(z)$ to remove
the case of $a = a'$ and then divide by 2 to remove the order. So here

$$\sum r_-(n) z^n = \frac{1}{2}\left[A^2(z) - A(z^2)\right]. \tag{8}$$

Finally for $r_+(n)$, we must add $A(z^2)$ to this result to reinstate the
case of $a = a'$, and we obtain

$$\sum r_+(n) z^n = \frac{1}{2}\left[A^2(z) + A(z^2)\right]. \tag{9}$$
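
To see the three identities at work, here is a small Python sketch of our own (the function names and the sample set are illustrative assumptions). It computes the three representation functions twice: once from the coefficients of A^2(z) and A(z^2), and once by counting pairs directly.

```python
def poly_square(a, limit):
    """Coefficients of A(z)^2, truncated below degree `limit`."""
    out = [0] * limit
    for i, ai in enumerate(a):
        if ai:
            for j, aj in enumerate(a):
                if aj and i + j < limit:
                    out[i + j] += ai * aj
    return out


def representation_series(A, limit):
    """Return (r, r_plus, r_minus) as coefficient lists via the three identities."""
    a = [1 if n in A else 0 for n in range(limit)]                  # A(z)
    a_squared = poly_square(a, limit)                               # A^2(z)
    a_of_z2 = [a[n // 2] if n % 2 == 0 else 0 for n in range(limit)]  # A(z^2)
    r_minus = [(s - d) // 2 for s, d in zip(a_squared, a_of_z2)]
    r_plus = [(s + d) // 2 for s, d in zip(a_squared, a_of_z2)]
    return a_squared, r_plus, r_minus


def count_directly(A, n):
    """The same three counts for a single n, straight from the definitions."""
    pairs = [(a, n - a) for a in A if a <= n and (n - a) in A]
    return (len(pairs),
            sum(1 for a, b in pairs if a <= b),
            sum(1 for a, b in pairs if a < b))


if __name__ == "__main__":
    A = {0, 1, 3, 4, 9, 10, 12, 13}       # an arbitrary sample set
    r, r_plus, r_minus = representation_series(A, 30)
    for n in range(30):
        print(n, (r[n], r_plus[n], r_minus[n]), count_directly(A, n))
```

For every n the two triples coincide, as they must.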


Can r(n) be “constant?” 


Is it possible to design a nontrivial set $A$, so that, say, $r_+(n)$ is the same
for all $n$? The answer is NO, for we would have to have $0 \in A$. And
then $1 \in A$, else $r_+(1) \ne r_+(0)$. And then $2 \notin A$, else $r_+(2) = 2$.
And then $3 \in A$, else $r_+(3) = 0$ (whereas $r_+(1) = 1$), then $4 \notin A$,
else $r_+(4) = 2$. Continuing in this manner, we find $5 \in A$. But now
we are stymied since now $6 = 1 + 5$, $6 = 3 + 3$, and $r_+(6) = 2$.

The suspicion arises, though, that this impossibility may just be
a quirk of “small” numbers. Couldn’t $A$ be designed so that, except
for some misbehavior at the beginning, $r_+(n)$ = constant?

We will analyze this question by using generating functions. So,
using (9), the question reduces to whether there is an infinite set $A$
for which

$$\frac{1}{2}\left[A^2(z) + A(z^2)\right] = P(z) + \frac{C}{1-z}, \tag{10}$$

$P(z)$ is a polynomial.

Answer: No. Just look what happens if we let $z \to (-1)^+$. Clearly
$P(z)$ and $\frac{C}{1-z}$ remain bounded, $A^2(z)$ remains nonnegative, and
$A(z^2)$ goes to $A(1) = \infty$, a contradiction.


A Splitting Problem 


Can we split the nonnegative integers in two sets A and B so that 
every integer n is expressible in the same number of ways as the 
sum of two distinct members of A, as it is as the sum of two distinct 
members of B? 

If we experiment a bit, before we get down to business, and begin
by placing $0 \in A$, then $1 \in B$, else 1 would be expressible as
$a + a'$ but not as $b + b'$. Next $2 \in B$, else 2 would be $a + a'$ but
not $b + b'$. Next $3 \in A$, else 3 would not be $a + a'$ whereas it
is $b + b' = 1 + 2$. Continuing in this manner, we seem to force
$A = \{0, 3, 5, 6, 9, \ldots\}$ and $B = \{1, 2, 4, 7, 8, \ldots\}$. But the pattern
is not clear, nor is the existence or uniqueness of the desired $A$, $B$. We
must turn to generating functions. So observe that we are requiring
by (8) that

$$\frac{1}{2}\left[A^2(z) - A(z^2)\right] = \frac{1}{2}\left[B^2(z) - B(z^2)\right]. \tag{11}$$


Also, because of the condition that $A$, $B$ be a splitting of the
nonnegatives, we also have the condition that

$$A(z) + B(z) = \frac{1}{1-z}. \tag{12}$$

From (11) we obtain

$$A^2(z) - B^2(z) = A(z^2) - B(z^2), \tag{13}$$

and so, by (12), we conclude that

$$[A(z) - B(z)] \cdot \frac{1}{1-z} = A(z^2) - B(z^2),$$

or

$$A(z) - B(z) = (1-z)\left[A(z^2) - B(z^2)\right]. \tag{14}$$

Now this is a relationship that can be iterated. We see that

$$A(z^2) - B(z^2) = (1-z^2)\left[A(z^4) - B(z^4)\right],$$

so that continuing gives

$$A(z) - B(z) = (1-z)(1-z^2)\left[A(z^4) - B(z^4)\right].$$

And, if we continue to iterate, we obtain

$$A(z) - B(z) = (1-z)(1-z^2) \cdots (1-z^{2^{n-1}})\left[A(z^{2^n}) - B(z^{2^n})\right], \tag{15}$$

and so, by letting $n \to \infty$, since $A(0) = 1$, $B(0) = 0$, we deduce
that

$$A(z) - B(z) = \prod_{i=0}^{\infty} (1 - z^{2^i}). \tag{16}$$


And this product is easy to “multiply out”. Every term $z^n$ occurs
uniquely since every $n$ is uniquely the sum of distinct powers of 2.
Indeed $z^n$ occurs with coefficient $+1$ if $n$ is the sum of an even number
of distinct powers of 2, and it has coefficient $-1$, otherwise.

We have achieved success! The sets $A$ and $B$ do exist, are unique,
and indeed are given by $A$ = Integers, which are the sum of an even
number of distinct powers of 2, and $B$ = Integers, which are the sum
of an odd number of distinct powers of 2. This is not one of those
problems where, after the answer is exposed, one proclaims, “oh, of
course.” It isn’t really trivial, even in retrospect, why the $A$ and $B$
have the same $r_-(n)$, or for that matter, to what this common $r_-(n)$
is equal. (See below where it is proved that $r_-(2^{2k+1} - 1) = 0$.)

$A$ = Integers with an even number of 1’s in radix 2. Then and
only then

$$\underbrace{11 \cdots 1}_{2k+1} = 2^{2k+1} - 1$$

is not the sum of two distinct $A$’s.

PROOF. A sum of two $A$’s, with no carries, has an even number of
1’s (so it won’t give $\underbrace{11 \cdots 1}_{\text{odd}}$), else look at the first carry. This gives
a 0 digit so, again, it’s not $11 \cdots 1$.

So $r_-(2^{2k+1} - 1) = 0$. We must now show that all other $n$ have
a representation as the sum of two numbers whose numbers of 1
digits are of like parity. First of all if $n$ contains $2k$ 1’s then it is the
sum of the first $k$ and the second $k$. Secondly if $n$ contains $2k + 1$
1’s but also a 0 digit then it is structured as $\underbrace{11 \cdots 1}_{m}\,0A$ where $A$
contains $2k + 1 - m$ 1’s and, say, is of total length $L$; then it can be
expressed as $\underbrace{11 \cdots 1}_{m-1}\,01\,\underbrace{00 \cdots 0}_{L}$ plus $1A$ and these two numbers
have respectively $m$ 1’s and $2k + 2 - m$ 1’s. These are again of like
parity so we are done.
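
The splitting is easy to test numerically. The Python sketch below is ours (split_by_bit_parity and r_minus are illustrative names): it builds the two sets from the parity of the binary digit count and tallies, for each n below a chosen limit, the representations as a sum of two distinct members of each set.

```python
def split_by_bit_parity(limit):
    """A: even number of binary 1's; B: odd number of binary 1's (below limit)."""
    A = [n for n in range(limit) if bin(n).count("1") % 2 == 0]
    B = [n for n in range(limit) if bin(n).count("1") % 2 == 1]
    return A, B


def r_minus(members, n):
    """Number of ways n = a + a' with a < a', both taken from the set `members`."""
    return sum(1 for a in members if 2 * a < n and (n - a) in members)


if __name__ == "__main__":
    A, B = split_by_bit_parity(512)
    set_a, set_b = set(A), set(B)
    print(A[:8], B[:8])
    for n in range(40):
        print(n, r_minus(set_a, n), r_minus(set_b, n))
```

The two counts agree for every n tried, and they vanish at 1, 7, 31, 127. Running it also shows zeros at n = 2 and n = 4, small cases in which every candidate pair either repeats a summand or takes one member from each set.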


An Identity of Euler’s 


Consider expressing $n$ as the sum of distinct positive integers, i.e.,
where repeats are not allowed. (So for $n = 6$, we have the expression
$1 + 2 + 3$ and also $2 + 4$, $1 + 5$, and just plain 6 alone.)

Also consider expressing $n$ as the sum of positive odd numbers,
but this time where repeats are allowed. (So for $n = 6$, we get $1 + 5$,
$3 + 3$, $1 + 1 + 1 + 3$, $1 + 1 + 1 + 1 + 1 + 1$.) In both cases we
obtained four expressions for 6, and a theorem of Euler’s says that
this is no coincidence, that is, it says the following:


Theorem. The number of ways of expressing n as the sum of distinct 
positive integers equals the number of ways of expressing n as the 
sum of (not necessarily distinct) odd positive integers. 


To prove this theorem we produce two generating functions. The
latter is exactly the “coin changing” function where the coins have
the denominations 1, 3, 5, 7, .... This generating function is given
by

$$\frac{1}{(1-z)(1-z^3)(1-z^5) \cdots}. \tag{17}$$

The other generating function is not of the coin changing variety
because of the distinctness condition. A moment’s thought, however,
shows that this generating function is given as the product of $1 + z$,
$1 + z^2$, $1 + z^3$, .... For, when these are multiplied out, each $z^k$ factor
occurs at most once. In short, the other generating function is

$$(1+z)(1+z^2)(1+z^3) \cdots. \tag{18}$$

Euler’s theorem in its analytic form is then just the identity

$$\frac{1}{(1-z)(1-z^3)(1-z^5) \cdots} = (1+z)(1+z^2)(1+z^3) \cdots \quad \text{throughout } |z| < 1. \tag{19}$$

Another way of writing (19) is

$$(1-z)(1-z^3)(1-z^5) \cdots (1+z)(1+z^2)(1+z^3) \cdots = 1, \tag{20}$$

which is the provocative assertion that, when this product is 
multiplied out, all of the terms (aside from the 1) cancel each other! 

To prove (20) multiply the $1 - z$ by the $1 + z$ (to get $1 - z^2$) and
do the same with $1 - z^3$ by $1 + z^3$, etc. This gives the new factors
$1 - z^2$, $1 - z^6$, $1 - z^{10}, \ldots$ and leaves untouched the old factors
$1 + z^2$, $1 + z^4$, $1 + z^6, \ldots$. These rearrangements are justified by
absolute convergence, and so we see that the product in (20), call it
$P(z)$, is equal to

$$(1-z^2)(1-z^6)(1-z^{10}) \cdots (1+z^2)(1+z^4)(1+z^6) \cdots$$

which just happens to be $P(z^2)$! So $P(z) = P(z^2)$ which of course
means that there can’t be any terms $az^k$, $a \ne 0$, $k \ne 0$, in the
expansion of $P(z)$ (the lowest such term would have degree $k$ in $P(z)$
but degree $2k$ in $P(z^2)$), i.e., $P(z)$ is just its constant term 1, as asserted.
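
Euler's theorem invites a quick machine check. In the Python sketch below (ours; the function names are illustrative) each count is obtained by multiplying out the corresponding product, one factor at a time, and reading off the coefficient of z^n.

```python
def count_distinct(n):
    """Coefficient of z^n in (1 + z)(1 + z^2)(1 + z^3)..."""
    ways = [1] + [0] * n
    for part in range(1, n + 1):
        # descending order, so that each part is used at most once
        for m in range(n, part - 1, -1):
            ways[m] += ways[m - part]
    return ways[n]


def count_odd(n):
    """Coefficient of z^n in 1 / ((1 - z)(1 - z^3)(1 - z^5)...)."""
    ways = [1] + [0] * n
    for part in range(1, n + 1, 2):
        # ascending order, so that an odd part may repeat
        for m in range(part, n + 1):
            ways[m] += ways[m - part]
    return ways[n]


if __name__ == "__main__":
    for n in range(1, 21):
        print(n, count_distinct(n), count_odd(n))
```

For n = 6 both functions return 4, matching the lists written out above.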


Marks on a Ruler 


Suppose that a 6” ruler is marked as usual at 0, 1, 2, 3, 4, 5, 6.
Using this ruler we may of course measure any integral length from
1 through 6. But we don’t need all of these markings to accomplish
these measurements. Thus we can remove the 2, 3, and 5, and the
marks at 0, 1, 4, 6 are sufficient. (The 2 can be measured between 4
and 6, the 3 can be gotten between 1 and 4, and the 5 between 1 and
6.) Since $\binom{4}{2} = 6$, this is a “perfect” situation. The question suggests
itself then, are there any larger perfect values? In short, can there
be integers $a_1 < a_2 < \cdots < a_n$ such that the differences $a_i - a_j$,
$i > j$, take on all the values $1, 2, 3, \ldots, \binom{n}{2}$?

If we introduce the usual generating function $A(z) = \sum_{i=1}^{n} z^{a_i}$,
then the differences are exposed, not when we square $A(z)$, but when
we multiply $A(z)$ by $A(\frac{1}{z})$. Thus $A(z) \cdot A(\frac{1}{z}) = \sum_{i,j=1}^{n} z^{a_i - a_j}$ and
if we split this (double) sum as $i > j$, $i = j$, and $i < j$, we obtain

$$A(z)\,A\!\left(\frac{1}{z}\right) = \sum_{\substack{i,j=1 \\ i>j}}^{n} z^{a_i - a_j} + n + \sum_{\substack{i,j=1 \\ i<j}}^{n} z^{a_i - a_j}.$$

Our “perfect ruler,” by hypothesis, then requires that the first sum be
equal to $\sum_{k=1}^{N} z^k$, $N = \binom{n}{2}$, and since the last sum is the same as the
first, with $\frac{1}{z}$ replacing $z$, our equation takes the simple form

$$A(z) \cdot A\!\left(\frac{1}{z}\right) = \sum_{k=-N}^{N} z^k + n - 1, \quad N = \binom{n}{2},$$

or, summing this geometric series,

$$A(z) \cdot A\!\left(\frac{1}{z}\right) = \frac{z^{N+1} - z^{-N}}{z - 1} + n - 1, \quad N = \binom{n}{2}. \tag{21}$$


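In search of a contradiction, we let $z$ lie on the unit circle $z = e^{i\theta}$,
so that the left side of (21) becomes simply $|A(e^{i\theta})|^2$, whereas the
right-hand side is

$$\frac{z^{N+\frac12} - z^{-(N+\frac12)}}{z^{\frac12} - z^{-\frac12}} + n - 1 = \frac{\sin (N+\frac12)\theta}{\sin \frac12\theta} + n - 1$$

and (21) reduces to

$$|A(e^{i\theta})|^2 = \frac{\sin (N+\frac12)\theta}{\sin \frac12\theta} + n - 1. \tag{22}$$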


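A contradiction will occur, then, if we pick a $\theta$ which makes

$$\frac{\sin (N+\frac12)\theta}{\sin \frac12\theta} < -(n-1). \tag{23}$$

(And we had better assume that $n \ge 5$, since we saw the perfect
ruler for $n = 4$.)

A good choice, then, is to make $\sin (N+\frac12)\theta = -1$, for example
by picking $\theta = \frac{3\pi}{2N+1} = \frac{3\pi}{n^2-n+1}$. In that case $\sin \frac{\theta}{2} < \frac{\theta}{2}$, $\frac{1}{\sin \frac{\theta}{2}} > \frac{2}{\theta}$,
$-\frac{1}{\sin \frac{\theta}{2}} < -\frac{2}{\theta} = -\frac{2n^2-2n+2}{3\pi}$, and so the requirement (23) follows
from $-\frac{2n^2-2n+2}{3\pi} < -(n-1)$ or $2n^2 - 2n + 2 > 3\pi(n-1)$. But
$2n^2 - 2n + 2 - 3\pi(n-1) > 2n^2 - 2n + 2 - 10(n-1) =$
$2(n-3)^2 - 6 \ge 2 \cdot 2^2 - 6 = 2$, for $n \ge 5$. There are no perfect
rulers!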
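
For small n the theorem can be confirmed by exhaustive search. The sketch below is our own illustration (perfect_rulers is a hypothetical name). It normalizes the ruler so that the first mark is 0 and the last is N = C(n,2), which loses nothing since the largest difference must equal N, and then tries every placement of the inner marks.

```python
from itertools import combinations
from math import comb


def perfect_rulers(n):
    """All mark sets 0 = a_1 < ... < a_n = N whose differences are exactly 1..N."""
    N = comb(n, 2)
    found = []
    for inner in combinations(range(1, N), n - 2):
        marks = (0, *inner, N)
        diffs = {b - a for a, b in combinations(marks, 2)}
        # there are N pairs and every difference lies in 1..N,
        # so N distinct differences means every value is attained
        if len(diffs) == N:
            found.append(marks)
    return found


if __name__ == "__main__":
    for n in range(2, 8):
        print(n, perfect_rulers(n))
```

The search returns the ruler 0, 1, 4, 6 and its mirror image 0, 2, 5, 6 for n = 4, and nothing at all for n = 5, 6, 7, in agreement with the argument above.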


Dissection into Arithmetic Progressions


It is easy enough to split the nonnegative integers into arithmetic
progressions. For example they split into the evens and the odds or
into the progressions $2n$, $4n + 1$, $4n + 3$. Indeed there are many
other ways, but all seem to require at least two of the progressions
to have the same common difference (the evens and odds both have 2 as
a common difference and the $4n + 1$ and $4n + 3$ both have 4). So
the question arises: Can the nonnegative integers be split into at least two
arithmetic progressions, no two of which have the same common
difference?

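Of course we look to generating functions for the answer. The
progression $an + b$, $n = 0, 1, 2, \ldots$ will be associated with the
function $\sum_{n=0}^{\infty} z^{an+b}$. Thus the dissection into evens and odds corresponds
to the identity $\sum_{n=0}^{\infty} z^n = \sum_{n=0}^{\infty} z^{2n} + \sum_{n=0}^{\infty} z^{2n+1}$ and
the dissection into $2n$, $4n + 1$, $4n + 3$ corresponds to $\sum_{n=0}^{\infty} z^n =$
$\sum_{n=0}^{\infty} z^{2n} + \sum_{n=0}^{\infty} z^{4n+1} + \sum_{n=0}^{\infty} z^{4n+3}$, etc. Since each of these series
is geometric, we can express their sums by $\sum_{n=0}^{\infty} z^{an+b} = \frac{z^b}{1-z^a}$. Our
question then is exactly whether there can be an identity

$$\frac{1}{1-z} = \frac{z^{b_1}}{1-z^{a_1}} + \frac{z^{b_2}}{1-z^{a_2}} + \cdots + \frac{z^{b_k}}{1-z^{a_k}}, \quad 1 < a_1 < a_2 < \cdots < a_k. \tag{24}$$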


Well, just as the experiment suggested, there cannot be such a dissection;
(24) is impossible. To see that (24) does, indeed, lead to a
contradiction, all we need do is let $z \to e^{2\pi i/a_k}$ and observe that then
all of the terms in (24) approach finite limits except the last term
$\frac{z^{b_k}}{1-z^{a_k}}$ which approaches $\infty$.
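
For experimenting with candidate dissections, a direct check of the definition is handy. The Python sketch below is ours (is_dissection is an illustrative name, and each progression an + b is passed as a pair (a, b)). It verifies only an initial segment of the integers, so a True answer is evidence up to the chosen limit rather than a proof.

```python
def is_dissection(progressions, limit):
    """True if each integer 0 <= m < limit lies in exactly one progression.

    Each progression is a pair (a, b) standing for {a*n + b : n = 0, 1, 2, ...}.
    """
    counts = [0] * limit
    for a, b in progressions:
        for value in range(b, limit, a):
            counts[value] += 1
    return all(c == 1 for c in counts)


if __name__ == "__main__":
    print(is_dissection([(2, 0), (2, 1)], 1000))          # evens and odds
    print(is_dissection([(2, 0), (4, 1), (4, 3)], 1000))  # 2n, 4n+1, 4n+3
    print(is_dissection([(2, 0), (3, 1)], 1000))          # distinct differences
```

The first two examples repeat a common difference and pass; the third, with differences 2 and 3, already fails at the integer 3, which neither progression contains.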

Hopefully, then, this chapter has helped take the sting out of the 
preposterous notion of using analysis in number theory. 


Problems for Chapter I


1. Produce a set $A$ such that $r(n) > 0$ for all $n$ in $1 \le n \le N$, but
with $|A| \le \sqrt{4N + 1}$.

2. Show that every set satisfying the conditions of (1) must have
$|A| \ge \sqrt{N}$.

3. Show directly, with no knowledge of Stirling’s formula, that $n! >$
$(n/e)^n$.


II


The Partition Function 


One of the simplest, most natural, questions one can ask in arithmetic 
is how to determine the number of ways of breaking up a given integer.
That is, we ask about a positive integer $n$: In how many ways can
it be written as $a + b + c + \cdots$ where $a, b, c, \ldots$ are positive integers?
It turns out that there are two distinct questions here, depending
on whether we elect to count the order of the summands. If we do
choose to let the order count, then the problem becomes too simple.
The answer is just $2^{n-1}$ and the proof is just induction. Things are
incredibly different and more complicated if order is not counted!

In this case the number of breakups or “partitions” is 1 for $n = 1$,
2 for $n = 2$, 3 for $n = 3$, 5 for $n = 4$, 7 for $n = 5$, e.g., 5 has the
representations $1+1+1+1+1$, $2+1+1+1$, $3+1+1$, $4+1$,
5, $3+2$, $2+2+1$, and no others. Remember such expressions
as $1 + 1 + 2 + 1$ are not considered different. The table can be
extended further of course but no apparent pattern emerges. There 
is a famous story concerning the search for some kind of pattern in 
this table. This is told of Major MacMahon who kept a list of these 
partition numbers arranged one under another up into the hundreds. 
It suddenly occurred to him that, viewed from a distance, the outline 
of the digits seemed to form a parabola! Thus the number of digits 
in $p(n)$, the number of partitions of $n$, is around $C\sqrt{n}$, or $p(n)$ itself
is very roughly $e^{\alpha\sqrt{n}}$. The first crude assessment of $p(n)$!

Among other things, however, this does tell us not to expect any
simple answers. Indeed later research showed that the true asymptotic
formula for $p(n)$ is $\frac{e^{\pi\sqrt{2n/3}}}{4\sqrt{3}\,n}$, certainly not a formula to be guessed!

Now we turn to the analytic number theory derivation of this
asymptotic formula.
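
Before the derivation, it may help to see the numbers. The Python sketch below is ours (partition_numbers and hardy_ramanujan are illustrative names). It builds the table of p(n) by treating every positive integer as a coin, and compares it with the asymptotic expression e^{π√(2n/3)} / (4√3 n).

```python
from math import exp, pi, sqrt


def partition_numbers(limit):
    """Return the list [p(0), p(1), ..., p(limit)]."""
    p = [1] + [0] * limit
    for part in range(1, limit + 1):      # every positive integer is a "coin"
        for m in range(part, limit + 1):
            p[m] += p[m - part]
    return p


def hardy_ramanujan(n):
    """The asymptotic expression e^(pi*sqrt(2n/3)) / (4*sqrt(3)*n)."""
    return exp(pi * sqrt(2 * n / 3)) / (4 * sqrt(3) * n)


if __name__ == "__main__":
    p = partition_numbers(1000)
    for n in (10, 100, 1000):
        print(n, p[n], len(str(p[n])), hardy_ramanujan(n) / p[n])
```

The digit counts in the third column grow roughly like a multiple of √n, which is the parabola of the story, and the ratio in the last column creeps toward 1 only slowly.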


The Generating Function 


To put into sharp focus the fact that order does not count, we may 
view p(n) as the number of representations of n as a sum of 1’s and 
2’s and 3’s ..., etc. But this is just the “change making” problem 
where coins come in all denominations. The analysis in that problem 
extends verbatim to this one, even though we now have an infinite 
number of coins. So we obtain

$$\sum_{n=0}^{\infty} p(n) z^n = \prod_{k=1}^{\infty} \frac{1}{1-z^k} \tag{1}$$

valid for $|z| < 1$, where we understand that $p(0) = 1$.

Having thus obtained the generating function, we turn to the sec- 
ond stage of attack, investigating the function. This is always the 
tricky (creative?) part of the process. We know pretty well what kind 
of information we desire about p(n): an estimate of its growth, per- 
haps even an asymptotic formula if we are lucky. But we don’t know 
exactly how this translates to the generating function. To grasp the 
connection between the generating function and its coefficients, then, 
seems to be the paramount step. How does one go from one to the 
other? Mainly how does one go from a function to its coefficients? 

It is here that complex numbers really play their most important 
role. The point is that there are formulas (for said coefficients). Thus 
we learned in calculus that, if $f(z) = \sum a_n z^n$, then $a_n = \frac{f^{(n)}(0)}{n!}$,
expressing the desired coefficients in terms of high derivatives of the
function. But this is a terrible way of getting at the thing. Except for
rare “made up” examples there is very little hope of obtaining the nth
derivative of a given function and even estimating these derivatives
is not a task with very good prospects. Face it, the calculus approach
is a flop.

Cauchy’s theorem gives a different and more promising approach.
Thus, again with $f(z) = \sum a_n z^n$, this time we have the formula
$a_n = \frac{1}{2\pi i} \int_C \frac{f(z)}{z^{n+1}}\,dz$, an integral rather than a differential operator!


Surely this is a more secure approach, because integral operators are 
bounded, and differential operators are not. The price we pay is that 
of passing to the complex numbers for our z’s. Not a bad price, is it? 
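
The integral formula is pleasant to try out numerically. In the Python sketch below (ours; the circle radius 0.5 and the 512 sample points are arbitrary assumptions) the contour is a circle about the origin, parametrized as z = r e^{it}, so that the coefficient integral becomes the average of f(z)/z^n over the circle. We apply it to the change-making function of Chapter I, whose coefficients we already know.

```python
import cmath


def taylor_coefficient(f, n, radius=0.5, samples=512):
    """Approximate a_n = (1/(2*pi*i)) * integral of f(z)/z^(n+1) dz over |z| = radius.

    With z = radius * e^(it) we have dz = i*z dt, so a_n is the mean of f(z)/z^n.
    """
    total = 0
    for m in range(samples):
        z = radius * cmath.exp(2j * cmath.pi * m / samples)
        total += f(z) / z ** n
    return total / samples


def change_function(z):
    """Generating function for change making with coins 1, 2, 3."""
    return 1 / ((1 - z) * (1 - z ** 2) * (1 - z ** 3))


if __name__ == "__main__":
    for n in range(13):
        print(n, round(taylor_coefficient(change_function, n).real, 6))
```

The printed values are the integers C(n), recovered without a single derivative.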

So let us get under way, but armed with the knowledge that the
valuable information about $f(z)$ will help in getting a good approximation
to $\int_C \frac{f(z)}{z^{n+1}}\,dz$. But a glance at the potentially explosive $\frac{1}{z^{n+1}}$
shows us that $C$ had better stay as far away from the origin as it can,
i.e., it must hug the unit circle. Again, a look at our generating function
$\sum p(n) z^n$ shows that it’s biggest when $z$ is positive (since the
coefficients are themselves positive). All in all, we see that we should
seek approximations to our generating function which are good for 
|z| near 1 with special importance attached to those z’s which are 
near +1. 


The Approximation 


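Starting with (1), $F(z) = \prod_{k=1}^{\infty} \frac{1}{1-z^k}$, and taking logarithms, we
obtain

$$\log F(z) = \sum_{k=1}^{\infty} \log \frac{1}{1-z^k} = \sum_{k=1}^{\infty} \sum_{j=1}^{\infty} \frac{1}{j} z^{kj} = \sum_{j=1}^{\infty} \frac{1}{j} \sum_{k=1}^{\infty} z^{jk} = \sum_{j=1}^{\infty} \frac{1}{j}\,\frac{z^j}{1-z^j}. \tag{2}$$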


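Now write $z = e^{-w}$ so that $\Re w > 0$ and obtain $\log F(e^{-w}) =$
$\sum_{k=1}^{\infty} \frac{1}{k}\,\frac{1}{e^{kw}-1}$. Thus noticing that the expansion of $\frac{1}{e^x - 1}$ begins with
$\frac{1}{x} - \frac{1}{2} + c_1 x + \cdots$ or equivalently (near 0) $\frac{1}{x} - \frac{e^{-x}}{2} + cx + \cdots$,
we rewrite this as

$$\log F(e^{-w}) = \frac{1}{w} \sum_{k=1}^{\infty} \frac{1}{k^2} - \frac{1}{2} \sum_{k=1}^{\infty} \frac{e^{-kw}}{k} + \sum_{k=1}^{\infty} \frac{1}{k}\left(\frac{1}{e^{kw}-1} - \frac{1}{kw} + \frac{e^{-kw}}{2}\right)$$
$$= \frac{\pi^2}{6w} + \frac{1}{2} \log(1 - e^{-w}) + \sum_{k=1}^{\infty} \frac{1}{kw}\left(\frac{1}{e^{kw}-1} - \frac{1}{kw} + \frac{e^{-kw}}{2}\right) w. \tag{3}$$

The form of this series is very suggestive. Indeed we recognize any
series $\sum \frac{1}{k} A(kw) = \sum \frac{A(kw)}{kw}\,w$ as a Riemann sum, approximating
the Riemann integral $\int_0^{\infty} \frac{A(t)}{t}\,dt$ for small positive $w$. It should come
as no surprise then, that such series are estimated rather accurately.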


So let us review the “Riemann sum story”. 


Riemann Sums 


Suppose that $\phi(x)$ is a positive decreasing function on $(0, \infty)$ and
that $h > 0$. The Riemann sum $\sum_{k=1}^{\infty} \phi(kh)h$ is clearly equal to the
area of the union of rectangles and so is bounded by the area under
$y = \phi(x)$. Hence $\sum_{k=1}^{\infty} \phi(kh)h \le \int_0^{\infty} \phi(x)\,dx$. On the other hand,
the series $\sum_{k=0}^{\infty} \phi(kh)h$ can be construed as the area of this union of
these rectangles and, as such, exceeds the area under $y = \phi(x)$. So
this time we obtain $\sum_{k=0}^{\infty} \phi(kh)h \ge \int_0^{\infty} \phi(x)\,dx$.

Combining these two inequalities tells us that the Riemann sum
lies within $h \cdot \phi(0)$ of the Riemann integral. This is all very nice and
rather accurate but it refers only to decreasing functions. However, we
may easily remedy this restriction by subtracting two such functions.
Thereby we obtain

$$\sum_{k=1}^{\infty} [\phi(kh) - \psi(kh)]h - \int_0^{\infty} [\phi(x) - \psi(x)]\,dx \ll h[\phi(0) + \psi(0)].$$

Calling $\phi(x) - \psi(x) = F(x)$ and then observing that $\phi(0) + \psi(0)$
is the total variation $V$ of $F(x)$ we have the rather general result

$$\sum_{k=1}^{\infty} F(kh)h - \int_0^{\infty} F(x)\,dx \ll h \cdot V(F). \tag{4}$$


To be sure, we have proven this result only for real functions but 
in fact it follows for complex ones, by merely applying it to the real 
and imaginary parts. 
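
A one-function experiment shows how sharp this estimate is. In the Python sketch below (ours; the test function e^{-x} and the cutoff at x = 60 are assumptions made for illustration) the integral equals 1 and the total variation equals 1, so the Riemann sum should lie within h of 1.

```python
from math import exp


def riemann_sum(F, h, cutoff=60.0):
    """Sum of F(k*h)*h for k = 1, 2, ..., truncated once k*h exceeds `cutoff`."""
    terms = int(cutoff / h)
    return sum(F(k * h) for k in range(1, terms + 1)) * h


def decaying(x):
    """Sample decreasing function: integral over (0, infinity) is 1, variation is 1."""
    return exp(-x)


if __name__ == "__main__":
    for h in (0.5, 0.1, 0.01):
        gap = abs(riemann_sum(decaying, h) - 1.0)
        print(h, gap, gap <= h)
```

For this function the gap is close to h/2, comfortably inside the bound.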


To modify this result to fit our situation, let us write $w = he^{i\theta}$,
$h > 0$, $-\pi/2 < \theta < \pi/2$, and conclude from (4) that

$$\sum_{k=1}^{\infty} F(khe^{i\theta})h - \int_0^{\infty} F(xe^{i\theta})\,dx \ll h \cdot V_\theta(F)$$

($V_\theta$ is the variation along the ray of argument $\theta$), so that

$$\sum_{k=1}^{\infty} F(kw)w - \int_0^{\infty} F(xe^{i\theta})\,d(xe^{i\theta}) \ll w \cdot V_\theta(F).$$

Furthermore, in our case of an analytic $F$, this integral is actually
independent of $\theta$. (Simply apply Cauchy’s theorem and observe that
at $\infty$ $F$ falls off like $\frac{1}{x^2}$.) We also may use the formula $V_\theta(F) =$
$\int_0^{\infty} |F'(xe^{i\theta})|\,dx$ and finally deduce that

$$\sum_{k=1}^{\infty} F(kw)w - \int_0^{\infty} F(x)\,dx \ll w \int_0^{\infty} |F'(xe^{i\theta})|\,dx.$$


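Later on we show that


$$\int_0^\infty \left(\frac{1}{e^x - 1} - \frac{1}{x} + \frac{e^{-x}}{2}\right)\frac{dx}{x} = \log\frac{1}{\sqrt{2\pi}} \tag{5}$$


and right now we may note that the (complicated) function


$$F'(xe^{i\theta}) = \frac{2}{x^3e^{3i\theta}} - \frac{e^{-xe^{i\theta}}}{2x^2e^{2i\theta}} - \frac{e^{-xe^{i\theta}}}{2xe^{i\theta}} - \frac{1}{x^2e^{2i\theta}\,(e^{xe^{i\theta}} - 1)} - \frac{e^{xe^{i\theta}}}{xe^{i\theta}\,(e^{xe^{i\theta}} - 1)^2}$$


is uniformly bounded by $\frac{M}{x^2+1}$ in any wedge $|\theta| \le c < \pi/2$ ($M = M(c)$), so that we obtain


$$\sum_{k=1}^{\infty} \frac{1}{k}\left(\frac{1}{e^{kw} - 1} - \frac{1}{kw} + \frac{e^{-kw}}{2}\right) - \log\frac{1}{\sqrt{2\pi}} \ll Mw \tag{6}$$


throughout $|\arg w| \le c < \pi/2$.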
The Approximation. We have prepared the way for the useful approximation
to our generating function. All we need to do is combine
(1), (3), and (6), replace w by $\log\frac{1}{z}$, and exponentiate. The result is


$$F(z) = \sqrt{\frac{1-z}{2\pi}}\,\exp\left(\frac{\pi^2}{6\log\frac{1}{z}}\right)[1 + O(1-z)] \quad\text{in}\quad \frac{|1-z|}{1-|z|} \le c.$$


But we perform one more “neatening” operation. Thus $\log\frac{1}{z}$ is
an eyesore! It isn’t at all analytic in the unit disc, we must replace
it (before anything good can result). So note that, near 1,


$$\log\frac{1}{z} = (1-z) + \frac{(1-z)^2}{2} + \frac{(1-z)^3}{3} + \cdots = 2\,\frac{1-z}{1+z} + O((1-z)^3),$$

or

$$\frac{1}{\log\frac{1}{z}} = \frac{1}{2}\,\frac{1+z}{1-z} + O(1-z).$$

Finally then,


$$F(z) = \sqrt{\frac{1-z}{2\pi}}\,\exp\left(\frac{\pi^2}{12}\,\frac{1+z}{1-z}\right)[1 + O(1-z)] \quad\text{in}\quad \frac{|1-z|}{1-|z|} \le c. \tag{7}$$


This is our basic approximation. It is good near z = 1, which
we have decided is the most important locale. Here we see that
we can replace our generating function by the elementary function
$\phi(z) = \sqrt{\frac{1-z}{2\pi}}\,\exp\left(\frac{\pi^2}{12}\,\frac{1+z}{1-z}\right)$ whose coefficients should then prove amenable.

However, (7) is really of no use away from z = 1, and, since 
Cauchy’s theorem requires values of z all along a closed loop sur- 
rounding 0, we see that something else must be supplied. Indeed we 
will show that, away from 1, everything is negligible by comparison. 
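The quality of the basic approximation near z = 1 can be sampled numerically. The following is a minimal sketch, assuming the product form $F(z) = \prod_k (1-z^k)^{-1}$ for the generating function and the elementary approximant $\sqrt{(1-z)/2\pi}\,\exp\left(\frac{\pi^2}{12}\frac{1+z}{1-z}\right)$; the function names and the truncation length are illustrative choices, and only real z in (0, 1) is tested.

```python
import math

def log_F(z, terms=5000):
    """log F(z) = -sum_k log(1 - z^k) for real 0 < z < 1, truncated after `terms` factors."""
    return -sum(math.log1p(-z ** k) for k in range(1, terms + 1))

def log_phi(z):
    """log of sqrt((1-z)/(2*pi)) * exp(pi^2/12 * (1+z)/(1-z))."""
    return 0.5 * math.log((1 - z) / (2 * math.pi)) + (math.pi ** 2 / 12) * (1 + z) / (1 - z)

for z in (0.5, 0.9, 0.99):
    diff = log_F(z) - log_phi(z)
    print(f"z={z}: log F - log phi = {diff:+.5f}   (1 - z = {1 - z:.2f})")
```

The difference of logarithms shrinks roughly in proportion to 1 - z, in line with the factor [1 + O(1 - z)].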
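To see this, let us return to (2) and conclude that


$$\log F(z) - \frac{z}{1-z} = \sum_{j=2}^{\infty} \frac{1}{j}\,\frac{z^j}{1-z^j} \ll \sum_{j=2}^{\infty} \frac{1}{j}\,\frac{|z|^j}{1-|z|^j} \le \frac{1}{1-|z|}\sum_{j=2}^{\infty}\frac{1}{j^2} = \left(\frac{\pi^2}{6} - 1\right)\frac{1}{1-|z|},$$

or

$$F(z) \ll \exp\left(\frac{1}{|1-z|} + \left(\frac{\pi^2}{6} - 1\right)\frac{1}{1-|z|}\right), \tag{8}$$


an estimate which is just what we need. It shows that, away from 1,
where $\frac{1}{|1-z|}$ is smaller than $\frac{1}{1-|z|}$, F(z) is rather small.


Thus, for example, we obtain


$$F(z) \ll \exp\left(\frac{1}{1-|z|}\right) \quad\text{when}\quad \frac{|1-z|}{1-|z|} \ge 3. \tag{9}$$

Also, in this same region, setting

$$\phi(z) = \sqrt{\frac{1-z}{2\pi}}\,\exp\left(\frac{\pi^2}{12}\,\frac{1+z}{1-z}\right), \tag{10}$$

$$\phi(z) \ll \exp\left(\frac{\pi^2}{12}\,\Re\frac{1+z}{1-z}\right) = \exp\left(\frac{\pi^2}{12}\,\frac{1-|z|^2}{|1-z|^2}\right) \le \exp\left(\frac{\pi^2}{6}\,\frac{1-|z|}{|1-z|^2}\right) \le \exp\left(\frac{\pi^2}{54}\,\frac{1}{1-|z|}\right),$$


so that


$$\phi(z) \ll \exp\left(\frac{1}{1-|z|}\right) \quad\text{when}\quad \frac{|1-z|}{1-|z|} \ge 3. \tag{11}$$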


The Cauchy Integral. Armed with these preparations and the
feeling that the coefficients of the elementary function $\phi(z)$ are accessible,
we launch our major Cauchy integral attack. So, to commence
the firing, we write


$$p(n) - q(n) = \frac{1}{2\pi i}\int_C \frac{F(z) - \phi(z)}{z^{n+1}}\,dz, \tag{12}$$

and we try C a circle near the unit circle, i.e.,


$$C \text{ is } |z| = r, \quad r < 1. \tag{13}$$


Next we break up C as dictated by our consideration of $\frac{|1-z|}{1-|z|}$,
namely, into


$$A \text{ is the arc } |z| = r,\ \frac{|1-z|}{1-|z|} \le 3, \quad\text{and}\quad B \text{ is the arc } |z| = r,\ \frac{|1-z|}{1-|z|} \ge 3. \tag{14}$$

So,

$$p(n) - q(n) = \frac{1}{2\pi i}\int_A \frac{F(z) - \phi(z)}{z^{n+1}}\,dz + \frac{1}{2\pi i}\int_B \frac{F(z) - \phi(z)}{z^{n+1}}\,dz, \tag{15}$$


and if we use (7) on this first integral and (9), (11) on this second
integral we derive the following estimates:


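$$\frac{1}{2\pi i}\int_A \frac{F(z) - \phi(z)}{z^{n+1}}\,dz \ll \frac{M'}{r^{n+1}}\,(1-r)^{3/2}\exp\left(\frac{\pi^2}{6}\,\frac{1}{1-r}\right) \times \text{the length of } A.$$


($M'$ is the implied constant in the O of (7) when c = 3).
As for the length of A, elementary geometry gives the formula


$$4r\arcsin\frac{\sqrt{2}\,(1-r)}{\sqrt{r}}$$


and this is easily seen to be O(1 - r). We finally obtain, then,

$$\frac{1}{2\pi i}\int_A \frac{F(z) - \phi(z)}{z^{n+1}}\,dz \ll M r^{-n}(1-r)^{5/2}\exp\left(\frac{\pi^2}{6}\,\frac{1}{1-r}\right), \tag{16}$$


where M is an absolute constant.

For the second integral,

$$\frac{1}{2\pi i}\int_B \frac{F(z) - \phi(z)}{z^{n+1}}\,dz \ll \frac{1}{2\pi r^{n+1}}\cdot 2\exp\left(\frac{1}{1-r}\right)\cdot 2\pi r = \frac{2}{r^n}\exp\left(\frac{1}{1-r}\right).$$

And this is even smaller than our previous estimate. So combining
the two gives, by (15),


$$p(n) - q(n) \ll r^{-n}(1-r)^{5/2}\exp\left(\frac{\pi^2}{6}\,\frac{1}{1-r}\right). \tag{17}$$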

But what is r? Answer: anything we please (as long as 0 < r < 1)!
We are masters of the choice, and so we attempt to minimize
the right-hand side. The exact minimum is too complicated but the
approximate one occurs when $e^{n(1-r)}\exp\left(\frac{\pi^2}{6}\,\frac{1}{1-r}\right)$ is minimized and
this occurs when $n(1-r) = \frac{\pi^2}{6(1-r)}$, i.e., $r = 1 - \frac{\pi}{\sqrt{6n}}$. So we
choose this r and, by so doing, we obtain, from (17), the bound


$$p(n) = q(n) + O\left(n^{-5/4}e^{\pi\sqrt{2n/3}}\right). \tag{18}$$


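The Coefficients of q(n)


The elementary function $\phi(z)$ has a rather pleasant definite integral
representation which will then lead to a handy expression for the q(n).
If we simply begin with the well-known identity


$$\int_{-\infty}^{\infty} e^{-t^2}\,dt = \sqrt{\pi}$$

and make a linear change of variables (a > 0), we obtain


$$\int_{-\infty}^{\infty} e^{-(at-b)^2}\,dt = \frac{\sqrt{\pi}}{a},$$

or

$$\int_{-\infty}^{\infty} e^{-a^2t^2}e^{2abt}\,dt = \frac{\sqrt{\pi}}{a}\,e^{b^2}.$$


Thus if we set $b^2 = \frac{\pi^2}{6}\,\frac{1}{1-z}$ and $a^2 = 1 - z$ (thinking of z as real
(|z| < 1) for now), we obtain


$$\int_{-\infty}^{\infty} e^{-(1-z)t^2}e^{\pi\sqrt{2/3}\,t}\,dt = \frac{\sqrt{\pi}}{\sqrt{1-z}}\exp\left(\frac{\pi^2}{6}\,\frac{1}{1-z}\right),$$

which gives, finally,


$$\phi(z) = \frac{e^{-\pi^2/12}}{\pi\sqrt{2}}\,(1-z)\int_{-\infty}^{\infty} e^{zt^2}e^{-t^2+\pi\sqrt{2/3}\,t}\,dt. \tag{19}$$

Equating coefficients therefore results in


$$q(n) = \frac{e^{-\pi^2/12}}{\pi\sqrt{2}}\int_{-\infty}^{\infty}\left(\frac{t^{2n}}{n!} - \frac{t^{2n-2}}{(n-1)!}\right)e^{-t^2}e^{\pi\sqrt{2/3}\,t}\,dt, \tag{20}$$


the “formula” for q(n) from which we can obtain asymptotics.
Reasoning that the maximum of the integrand occurs near $t = \sqrt{n}$
we change variables by $t = s + \sqrt{n}$, and thereby obtain


$$q(n) = C_n\int_{-\infty}^{\infty} K_n(s)\,2s\,e^{-2(s-\pi/\sqrt{24})^2}\,ds, \tag{21}$$

where

$$C_n = \frac{e^{\pi\sqrt{2n/3}}\,n^{n-1/2}e^{-n}}{\pi\sqrt{2}\,n!}, \qquad K_n(s) = \frac{1+\frac{s}{2\sqrt{n}}}{\left(1+\frac{s}{\sqrt{n}}\right)^2}\left[\left(1+\frac{s}{\sqrt{n}}\right)e^{-\frac{s}{\sqrt{n}}+\frac{s^2}{2n}}\right]^{2n}.$$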


Since $K_n(s) \to 1$, we see, at least formally, that the above integral
approaches


$$\int_{-\infty}^{\infty} 2s\,e^{-2(s-\pi/\sqrt{24})^2}\,ds = \int_{-\infty}^{\infty}\left(u + \frac{\pi}{2\sqrt{3}}\right)e^{-u^2}\,du,$$

where we have set $s = \frac{u}{\sqrt{2}} + \frac{\pi}{\sqrt{24}}$. Furthermore, since $ue^{-u^2}$ is
odd, it is equal to $\frac{\pi}{2\sqrt{3}}\int_{-\infty}^{\infty} e^{-u^2}\,du = \frac{\pi^{3/2}}{2\sqrt{3}}$. Thus (21) formally
becomes

$$q(n) \sim \frac{e^{\pi\sqrt{2n/3}}\,\sqrt{2\pi n}\,n^n}{4\sqrt{3}\,n\,e^n\,n!}. \tag{22}$$


And score another one for Stirling’s formula, which in turn
gives


$$q(n) \sim \frac{e^{\pi\sqrt{2n/3}}}{4\sqrt{3}\,n}, \tag{23}$$


and our earlier estimate (18) allows us thereby to conclude that


$$p(n) \sim \frac{e^{\pi\sqrt{2n/3}}}{4\sqrt{3}\,n}. \tag{24}$$


Success! We have determined the asymptotic formula for p(n)!
Well, almost. We still have two debts outstanding. We must justify
our formal passage to the limit in (21), and we must also prove our
evaluation (5). So first we observe that $xe^{-x}$ is maximized at x = 1,
so we deduce that


$$\left(1 + \frac{s}{\sqrt{n}}\right)e^{-s/\sqrt{n}} \le 1 \tag{25}$$

(using $x = 1 + \frac{s}{\sqrt{n}}$) and also


$$\left(1 + \frac{s}{\sqrt{n}}\right)^2 e^{-2s/\sqrt{n}} \le e^{s^2/n} \tag{26}$$


(using $x = \left(1 + \frac{s}{\sqrt{n}}\right)^2$).
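Thus using (25) for positive s, by (21),


$$K_n(s) \le e^{s^2} \quad\text{for } s \ge 0, \tag{27}$$

and using (26) for negative s gives us


$$|K_n(s)| \le (1-s)\left|1+\frac{s}{\sqrt{n}}\right|^{2n-2}e^{-2s\sqrt{n}+s^2} \le (1-s)\,e^{2s^2-\frac{2s}{\sqrt{n}}-\frac{s^2}{n}} = (1-s)\,e^{2s^2+1-(1+s/\sqrt{n})^2},$$

or

$$|K_n(s)| \le (1-s)\,e^{2s^2+1} \quad\text{for } s \le 0. \tag{28}$$

Thus (27) and (28) give the bound for our integrand in (21) of


$$2s\,e^{s^2-2(s-\pi/\sqrt{24})^2} \quad\text{for } s \ge 0,$$

and

$$|2s|\,(1-s)\,e^{2s^2+1-2(s-\pi/\sqrt{24})^2} = |2s|\,(1-s)\,e^{1-\pi^2/12+\pi\sqrt{2/3}\,s} \quad\text{for } s \le 0.$$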


This bound, integrable over (—oo, 00), gives us the required 
dominated convergence, and the passage to the limit is indeed 
justified. 
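The asymptotic formula $p(n) \sim e^{\pi\sqrt{2n/3}}/(4\sqrt{3}\,n)$ can be compared with exact values. The sketch below computes p(n) by expanding the product $\prod_k (1-z^k)^{-1}$ one factor at a time; this is a standard dynamic-programming method, not the contour-integral argument of the text, and the function names are illustrative.

```python
import math

def partition_numbers(n_max):
    """Return [p(0), ..., p(n_max)] by multiplying in the factor 1/(1 - z^k) for k = 1, 2, ..."""
    p = [1] + [0] * n_max
    for k in range(1, n_max + 1):
        for m in range(k, n_max + 1):
            p[m] += p[m - k]
    return p

def asymptotic(n):
    """The leading-order approximation e^{pi sqrt(2n/3)} / (4 sqrt(3) n)."""
    return math.exp(math.pi * math.sqrt(2 * n / 3)) / (4 * math.sqrt(3) * n)

p = partition_numbers(1000)
for n in (10, 100, 1000):
    print(n, p[n], f"ratio p(n)/asymptotic = {p[n] / asymptotic(n):.4f}")
```

The ratio drifts toward 1 only slowly, which is consistent with an asymptotic statement rather than a sharp approximation for small n.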

Finally we give the following: 


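Evaluation of our Integral (5). To achieve this let us first note that
as $N \to \infty$ our integral is the limit of the integral


$$\int_0^\infty (1 - e^{-Nx})\left(\frac{1}{e^x - 1} - \frac{1}{x} + \frac{e^{-x}}{2}\right)\frac{dx}{x}$$


(by dominated convergence, e.g.). But this integral can be split into


$$\int_0^\infty (1 - e^{-Nx})\left(\frac{1}{e^x - 1} - \frac{1}{x}\right)\frac{dx}{x} + \frac{1}{2}\int_0^\infty (1 - e^{-Nx})\,e^{-x}\,\frac{dx}{x}.$$


Next note that


$$(1 - e^{-Nx})\left(\frac{1}{e^x - 1} - \frac{1}{x}\right)\frac{1}{x} = \sum_{k=1}^{N}\left(\frac{e^{-kx}}{x} - \frac{e^{-(k-1)x} - e^{-kx}}{x^2}\right) = -\sum_{k=1}^{N}\int_0^1 t\,e^{-(k+t-1)x}\,dt$$

and


$$\frac{e^{-x} - e^{-(N+1)x}}{x} = \int_1^{N+1} e^{-sx}\,ds.$$


Hence, by Fubini, we may interchange and obtain, for our expression,
the elementary sum


$$-\sum_{k=1}^{N}\int_0^1 \frac{t\,dt}{k+t-1} + \frac{1}{2}\int_1^{N+1}\frac{ds}{s}$$

$$= \sum_{k=1}^{N}\left((k-1)\log\frac{k}{k-1} - 1\right) + \frac{1}{2}\log(N+1)$$

$$= \sum_{k=1}^{N}\left[(k-1)\log k - (k-1)\log(k-1)\right] - N + \frac{1}{2}\log(N+1)$$

$$= N\log N - \log N - \log(N-1) - \cdots - \log 1 - N + \frac{1}{2}\log(N+1)$$

$$= N\log N - \log N! - N + \frac{1}{2}\log(N+1).$$

(The k = 1 term $(k-1)\log\frac{k}{k-1}$ is read as 0.)


What luck! This is equal to $\log\frac{(N/e)^N\sqrt{N+1}}{N!}$ and so, by Stirling’s
formula, indeed approaches $\log\frac{1}{\sqrt{2\pi}}$.

(Stirling’s formula was used twice and hence needn’t have been
used at all! Thus we ended up not needing the fact that $C = \sqrt{2\pi}$
in the formula $n! \sim C\sqrt{n}\,(n/e)^n$ since the C cancels against a C in
the denominator. The n! formula with C instead of $\sqrt{2\pi}$ is a much
simpler result.)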
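The limit of the elementary sum is easy to watch numerically. This is a small sketch, assuming `math.lgamma(N + 1)` as a stand-in for $\log N!$; the function name `elementary_sum` is illustrative.

```python
import math

def elementary_sum(N):
    """N log N - log N! - N + (1/2) log(N + 1), with log N! computed by math.lgamma."""
    return N * math.log(N) - math.lgamma(N + 1) - N + 0.5 * math.log(N + 1)

target = math.log(1 / math.sqrt(2 * math.pi))
for N in (10, 1000, 100000):
    print(N, elementary_sum(N), target)
```

The printed values should settle toward $\log\frac{1}{\sqrt{2\pi}} \approx -0.9189$ as N grows.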
Problems for Chapter II


1. Explain the observation that MacMahon made of a parabola when
he viewed the list of the (decimal expansions) of the partition
function.


2. Prove the “simple” fact that, if order counts (e.g., 2 + 5 is considered
a different partition of 7 than 5 + 2), then the total number
of partitions of n would be $2^{n-1}$.


3. Explain the approximation “near 1” of $\log\frac{1}{z}$ as $2\,\frac{1-z}{1+z} + O((1-z)^3)$.
Why does this lead to


$$\frac{1}{\log\frac{1}{z}} = \frac{1}{2}\,\frac{1+z}{1-z} + O(1-z)?$$


4. Why is the Riemann sum such a good approximation to the integral
when the function is monotone and the increments are
equal?


III
The Erdős–Fuchs Theorem


There has always been some fascination with the possibility of near
constancy of the representation functions $r_i(n)$ (of I (7), (8) and (9)).
In Chapter I we treated the case of $r_+(n)$ and showed that this could
not eventually be constant. The fact that r(n) cannot be constant for an
infinite set is really trivial since r(n) is odd for n = 2a, a ∈ A, and
even otherwise. The case of $r_-(n)$ is more difficult, and we will treat
it in this chapter as an introduction to the analysis in the Erdős–Fuchs
theorem.

The Erdős–Fuchs theorem involves the question of just how nearly
constant r(n) can be on average. Historically this all began with the
set $A = \{n^2 : n \in \mathbb{N}_0\}$, the set of perfect squares, and the observation
that then $\frac{r(0)+r(1)+r(2)+\cdots+r(n)}{n+1}$, the average value, is exactly equal to
$\frac{1}{n+1}$ times the number of lattice points in the quarter disc $x, y \ge 0$,
$x^2 + y^2 \le n$. Consideration of the double Riemann integral shows
that this average approaches the area of the unit quarter circle, namely
$\pi/4$, and so for this set A, $\frac{r(0)+r(1)+\cdots+r(n)}{n+1} \to \frac{\pi}{4}$ (r(n) is on
average equal to the constant $\pi/4$.)
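The convergence of this average to $\pi/4$ can be checked directly by counting lattice points. The sketch below counts pairs (x, y) of nonnegative integers with $x^2 + y^2 \le n$ column by column; the function name `average_r` is an illustrative choice.

```python
import math

def average_r(n):
    """(r(0) + ... + r(n)) / (n + 1) for A = perfect squares: the number of lattice
    points (x, y) with x, y >= 0 and x^2 + y^2 <= n, divided by n + 1."""
    count = sum(math.isqrt(n - x * x) + 1 for x in range(math.isqrt(n) + 1))
    return count / (n + 1)

for n in (10, 1000, 10**6):
    print(n, average_r(n), math.pi / 4)
```

For n = 1 the three points (0, 0), (1, 0), (0, 1) give the average 1.5, and for large n the values approach 0.7853..., the area of the unit quarter circle.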

The difficult question is how quickly this limit is approached. Thus
fairly simple reasoning shows that


$$\frac{r(0)+r(1)+r(2)+\cdots+r(n)}{n+1} = \frac{\pi}{4} + O\left(\frac{1}{\sqrt{n}}\right),$$


whereas more involved analysis shows that


$$\frac{r(0)+r(1)+r(2)+\cdots+r(n)}{n+1} = \frac{\pi}{4} + O\left(\frac{1}{n^{2/3}}\right).$$

Very deep arguments have even improved this to $o\left(\frac{1}{n^{2/3}}\right)$, for example,
and the conjecture is that it is actually $O\left(\frac{1}{n^{3/4-\epsilon}}\right)$ for every
$\epsilon > 0$. On the other hand, further difficult arguments show that it is
not $O\left(\frac{1}{n^{3/4+\epsilon}}\right)$.


Now all of these arguments were made for the very special case
of A = the perfect squares. What a surprise then, when Erdős and
Fuchs showed, by simple analytic number theory, the following:


Theorem. For any set A, $\frac{r(0)+r(1)+\cdots+r(n)}{n+1} = C + O\left(\frac{1}{n^{3/4+\epsilon}}\right)$ is
impossible unless C = 0.


This will be proved in the current chapter, but first an appetizer.
We prove that $r_-(n)$ can’t eventually be constant.
So let us assume that


$$A^2(z) - A(z^2) = P(z) + \frac{C}{1-z}, \tag{1}$$


P is a polynomial, and C is a positive constant. Now look for a contradiction.
The simple device of letting $z \to (-1)^+$ which worked
so nicely for the $r_+$ problem, leads nowhere here. The exercises in
Chapter I were, after all, hand picked for their simplicity and involved
only the lightest touch of analysis. Here we encounter a slightly heavier
dose. We proceed, namely, by integrating the modulus around a
circle. From (1), we obtain, for $0 \le r < 1$,


$$\int_{-\pi}^{\pi} |A^2(re^{i\theta})|\,d\theta \le \int_{-\pi}^{\pi} |A(r^2e^{2i\theta})|\,d\theta + \int_{-\pi}^{\pi} |P(re^{i\theta})|\,d\theta + C\int_{-\pi}^{\pi}\frac{d\theta}{|1 - re^{i\theta}|}. \tag{2}$$
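Certain estimates are fairly evident. P(z) is a polynomial and so

$$\int_{-\pi}^{\pi} |P(re^{i\theta})|\,d\theta \le M, \tag{3}$$


independent of r ($0 \le r < 1$).
We can also estimate the (elliptic) integral $\int_{-\pi}^{\pi}\frac{d\theta}{|1-re^{i\theta}|} = 2\int_0^{\pi}\frac{d\theta}{|1-re^{i\theta}|}$
by the observation that if z is any complex number in
the first quadrant then $|z| \le \Re z + \Im z$. Thus since for $0 \le \theta \le \pi$,
$1 - re^{-i\theta}$ is in the first quadrant, $\frac{1}{1-re^{i\theta}} = \frac{1-re^{-i\theta}}{|1-re^{i\theta}|^2}$ also is, and
$\frac{1}{|1-re^{i\theta}|} \le (\Re + \Im)\,\frac{1}{1-re^{i\theta}}$. Hence


$$\int_0^{\pi}\frac{d\theta}{|1-re^{i\theta}|} \le (\Re+\Im)\int_0^{\pi}\frac{d\theta}{1-re^{i\theta}} = (\Re+\Im)\Big(\theta + i\log(1-re^{i\theta})\Big)\Big|_0^{\pi} = (\Re+\Im)\left(\pi + i\log\frac{1+r}{1-r}\right) = \pi + \log\frac{1+r}{1-r}.$$

The bound, then, is


$$\int_{-\pi}^{\pi}\frac{d\theta}{|1-re^{i\theta}|} \le 2\pi + 2\log\left(\frac{1+r}{1-r}\right). \tag{4}$$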
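The bound (4), read as $\int_{-\pi}^{\pi} d\theta/|1-re^{i\theta}| \le 2\pi + 2\log\frac{1+r}{1-r}$, can be spot-checked by numerical quadrature. This is a rough sketch using the midpoint rule; the step count and the function name are illustrative assumptions, and the check is numerical evidence only.

```python
import math

def modulus_integral(r, steps=100_000):
    """Midpoint-rule approximation of the integral of 1/|1 - r e^{i theta}| over (-pi, pi)."""
    h = 2 * math.pi / steps
    total = 0.0
    for j in range(steps):
        theta = -math.pi + (j + 0.5) * h
        total += 1.0 / abs(1 - r * complex(math.cos(theta), math.sin(theta)))
    return total * h

for r in (0.5, 0.9, 0.99):
    bound = 2 * math.pi + 2 * math.log((1 + r) / (1 - r))
    print(f"r={r}: integral ~ {modulus_integral(r):.4f}, bound = {bound:.4f}")
```

Both columns grow like a multiple of $\log\frac{1}{1-r}$ as r approaches 1, with the integral staying below the bound.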


The integral $\int_{-\pi}^{\pi} |A(re^{i\theta})|^2\,d\theta$ is a delight. It succumbs to Parseval’s
identity. This is the observation that


$$\int_{-\pi}^{\pi}\left|\sum a_n e^{in\theta}\right|^2 d\theta = \int_{-\pi}^{\pi}\sum a_n e^{in\theta}\sum \bar{a}_m e^{-im\theta}\,d\theta = \sum_{n,m} a_n\bar{a}_m\int_{-\pi}^{\pi} e^{i(n-m)\theta}\,d\theta$$


and these integrals all vanish except that, when n = m, they are
equal to $2\pi$. Hence this double sum is $2\pi\sum |a_n|^2$. The derivation is
clearly valid for finite or absolutely convergent series which covers
our case of $A(re^{i\theta})$ (but it even holds in much greater “miraculous”
generalities).
At any rate, Parseval’s identity gives us


$$\int_{-\pi}^{\pi} |A(re^{i\theta})|^2\,d\theta = 2\pi\sum_{a\in A} r^{2a} = 2\pi A(r^2). \tag{5}$$
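Parseval's identity in the form $\int_{-\pi}^{\pi}|A(re^{i\theta})|^2\,d\theta = 2\pi A(r^2)$ can be verified for a finite set. The sketch below uses the first few perfect squares as an assumed example set and an equally spaced sum, which integrates a trigonometric polynomial exactly once the number of sample points exceeds its degree.

```python
import cmath
import math

def A(z, elements):
    """Generating function A(z) = sum of z^a over the finite set `elements`."""
    return sum(z ** a for a in elements)

def mean_square_integral(elements, r, points=256):
    """Integral of |A(r e^{i theta})|^2 over (-pi, pi), by an equally spaced sum."""
    h = 2 * math.pi / points
    return h * sum(abs(A(r * cmath.exp(1j * (-math.pi + j * h)), elements)) ** 2
                   for j in range(points))

squares = [0, 1, 4, 9, 16, 25]
r = 0.9
lhs = mean_square_integral(squares, r)
rhs = 2 * math.pi * A(r * r, squares)
print(lhs, rhs)
```

The two printed numbers should agree to rounding error.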

The last integral we must cope with is $\int_{-\pi}^{\pi} |A(r^2e^{2i\theta})|\,d\theta$, and,
unlike integrals of $|f|^2$, there is no formula for integrals of $|f|$. But
there is always the Schwarz inequality $\int |f| \le \left(\int 1\cdot\int |f|^2\right)^{1/2}$, and
so at least we can get an upper bound for such integrals, again by
Parseval. The conclusion is that


$$\int_{-\pi}^{\pi} |A(r^2e^{2i\theta})|\,d\theta \le 2\pi\sqrt{A(r^4)}. \tag{6}$$


All four of the integrals in (2) have been spoken for and so, by (2)
through (6), we obtain


$$A(r^2) \le \sqrt{A(r^4)} + \frac{M}{2\pi} + C + \frac{C}{\pi}\log\left(\frac{1+r}{1-r}\right). \tag{7}$$


It is a nuisance that our function A is evaluated at two different
points, but we can alleviate that by the obvious monotonicity of A,
$A(r^4) \le A(r^2)$, and obtain


$$A(r^2) \le \sqrt{A(r^2)} + \frac{M}{2\pi} + C + \frac{C}{\pi}\log\left(\frac{1+r}{1-r}\right). \tag{8}$$


Is something bounded in terms of its own square root? But if $x \le \sqrt{x} + a$,
we obtain $\left(\sqrt{x} - \frac{1}{2}\right)^2 \le a + \frac{1}{4}$, $\sqrt{x} \le \sqrt{a + \frac{1}{4}} + \frac{1}{2}$,
$x \le a + \frac{1}{2} + \sqrt{a + \frac{1}{4}}$. This yields a pure bound on x. Then


$$A(r^2) \le M' + \frac{C}{\pi}\log\frac{1}{1-r} + \sqrt{M' + \frac{C}{\pi}\log\frac{1}{1-r}}. \tag{9}$$


But, so what? This says that $A(r^2)$ grows only at the order of
$\log\frac{1}{1-r}$, but it doesn’t say that $A(r^2)$ remains bounded,
does it? Wherein is the hoped contradiction? We must revisit (1)
for this. Thereby we obtain, in turn $A^2(r^2) - A(r^4) = P(r^2) + \frac{C}{1-r^2}$,
$A^2(r^2) > P(r^2) + \frac{C}{1-r^2}$, $A^2(r^2) \ge -M + \frac{C}{1-r^2}$, and finally


$$A(r^2) \ge \sqrt{-M + \frac{C}{1-r^2}}, \tag{10}$$


a rate of growth which flatly contradicts (9) and so gives our desired
contradiction.

If this proof seems like just so much sleight of hand, let us observe
what is “really” going on. We find ourselves with a set A
whose $r_-(n)$ is “almost” constant and this means that $A^2(z) \approx \frac{C}{1-z}$.
On the one hand, this forces A(z) to be large on the positive axis
$\left(A(r^2) > \frac{C'}{\sqrt{1-r}}\right)$, and, on the other hand Parseval says that the
integral of $|A^2(z)|$ is $A(r^2)$ and $\left|\frac{C}{1-z}\right|$ (being fairly small except near
1) has a small integral, only $O\left(\log\frac{1}{1-r}\right)$. (So $A(r^2) < C''\log\frac{1}{1-r}$.)

In cruder terms, Parseval tells us that $A^2(z)$ is large on average,
so it must be large elsewhere than just near z = 1, and so it cannot
really be like $\frac{C}{1-z}$. (Note that the “elsewhere” in the earlier $r_+(n)$
problem was the locale of −1, and so even that argument seems to
be in this spirit.)

So let us turn to the Erdős–Fuchs theorem with the same strategy
in mind, viz., to bound $A(r^2)$ below by $\frac{C'}{\sqrt{1-r}}$
and then to bound it above by Parseval considerations.


Erdős–Fuchs Theorem


We assume that A is a set for which

$$r(0) + r(1) + \cdots + r(n) = C(n+1) + O(n^a), \quad C > 0, \tag{11}$$


and we wish to deduce that $a \ge \frac{1}{4}$. As usual, we introduce the
generating function $A(z) = \sum_{a\in A} z^a$, so that $A^2(z) = \sum r(n)z^n$,
and therefore $\frac{1}{1-z}A^2(z) = \sum [r(0) + r(1) + \cdots + r(n)]z^n$. Since
$\sum (n+1)z^n = \frac{1}{(1-z)^2}$, our hypothesis (11) can be written as


$$\frac{A^2(z)}{1-z} = \frac{C}{(1-z)^2} + \sum_{n=0}^{\infty} a_n z^n, \quad a_n = O(n^a),$$

or


$$A^2(z) = \frac{C}{1-z} + (1-z)\sum_{n=0}^{\infty} a_n z^n, \quad a_n = O(n^a). \tag{12}$$

Of course we may assume throughout that a < 1. Thereby (12)
yields the bound $M(1-r^2)^{-a-1}$ for $\sum a_n r^{2n}$, so that we easily achieve
our first goal namely,


$$A(r^2) > \frac{C'}{\sqrt{1-r^2}}, \quad C' > 0. \tag{13}$$
As for the other goal, the Parseval upper bound on $A(r^2)$, again
we wish to exploit the fact that $A^2(z)$ is “near” $\frac{C}{1-z}$, but this takes
some doing. From the look of (12) unlike (1), this “nearness” seems
to occur only where $(1-z)\sum a_n z^n$ is relatively small, that is, only
in a neighborhood of z = 1. We must “enhance” this locale if we are
to expect anything from the integration, and we do so by multiplying
by a function whose “heft” or largeness is all near z = 1. A handy
such multiplier for us is the function $S^2(z)$ where


$$S(z) = 1 + z + z^2 + \cdots + z^{N-1}, \quad N \text{ large}. \tag{14}$$

The multiplication of $S^2(z)$ by (12) yields


$$[S(z)A(z)]^2 = \frac{C\,S^2(z)}{1-z} + (1 - z^N)\,S(z)\sum a_n z^n, \tag{15}$$


which gives


$$|S(z)A(z)|^2 \le \frac{CN^2}{|1-z|} + 2\left|S(z)\sum a_n z^n\right|, \tag{16}$$


and integration leads to


$$\int_{-\pi}^{\pi} |S(re^{i\theta})A(re^{i\theta})|^2\,d\theta \le CN^2\int_{-\pi}^{\pi}\frac{d\theta}{|1-re^{i\theta}|} + 2\int_{-\pi}^{\pi}\left|S(re^{i\theta})\sum a_n (re^{i\theta})^n\right| d\theta. \tag{17}$$


As before, we will use Parseval on the first of these integrals, (4) 
on the second, and Schwarz’s inequality together with Parseval on 
the third. 

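So write $S(z)A(z) = \sum c_n z^n$, and conclude that $\int_{-\pi}^{\pi} |S(re^{i\theta})A(re^{i\theta})|^2\,d\theta = 2\pi\sum |c_n|^2 r^{2n}$.
Since the $c_n$ are integers, $|c_n|^2 = c_n^2 \ge c_n$ and so this is, furthermore,
$\ge 2\pi\sum c_n r^{2n} = 2\pi S(r^2)A(r^2)$.
(The general fact then is that, if F(z) has integral coefficients,
$\int_{-\pi}^{\pi} |F(re^{i\theta})|^2\,d\theta \ge 2\pi F(r^2)$.)

Now we introduce a side condition on our parameters r and N
which we shall insist on henceforth namely that


$$\frac{1}{1-r^2} \ge N. \tag{18}$$


Thus, by (14), $S(r^2) > Nr^{2N} \ge N\left(1 - \frac{1}{N}\right)^N \ge N\left(1 - \frac{1}{2}\right)^2 = \frac{N}{4}$,
and by (13), $A(r^2) > \frac{C'}{\sqrt{1-r^2}}$, and we conclude that


$$\int_{-\pi}^{\pi} |S(re^{i\theta})A(re^{i\theta})|^2\,d\theta > \frac{C''N}{\sqrt{1-r^2}}, \quad C'' > 0. \tag{19}$$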
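Next, (4) gives


$$CN^2\int_{-\pi}^{\pi}\frac{d\theta}{|1-re^{i\theta}|} \le MN^2\log\frac{e}{1-r^2}, \tag{20}$$


and our last integral satisfies

$$\int_{-\pi}^{\pi}\left|S(re^{i\theta})\sum a_n(re^{i\theta})^n\right| d\theta \le \sqrt{\int_{-\pi}^{\pi}|S(re^{i\theta})|^2\,d\theta\int_{-\pi}^{\pi}\left|\sum a_n(re^{i\theta})^n\right|^2 d\theta}$$

$$= \sqrt{2\pi\sum_{k<N} r^{2k}\cdot 2\pi\sum |a_n|^2 r^{2n}} \le 2\pi\sqrt{N}\,M\sqrt{\sum n^{2a}r^{2n}}.$$


Applying (13) and (14) again leads finally to


$$\int_{-\pi}^{\pi}\left|S(re^{i\theta})\sum a_n(re^{i\theta})^n\right| d\theta \le \frac{M\sqrt{N}}{(1-r^2)^{a+1/2}}. \tag{21}$$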
At last, combining (19), (20), and (21) allows the conclusion


$$\frac{C''}{M} \le N\sqrt{1-r^2}\,\log\frac{e}{1-r^2} + \frac{1}{\sqrt{N}\,(1-r^2)^a}. \tag{22}$$

Once again we are masters of the parameters (subject to (18)),
and so we elect to choose r, so that $N\sqrt{1-r^2} = \frac{1}{\sqrt{N}\,(1-r^2)^a}$. Thus
our choice is to make $\frac{1}{1-r^2} = N^{\frac{3}{2a+1}}$ and note happily that our side
condition (18) is satisfied. Also “plugging” this choice into (22) gives


$$\frac{C''}{M} \le N^{\frac{4a-1}{4a+2}}\,(2 + 3\log N). \tag{23}$$


Well, success is delicious. We certainly see in (23) the fact that
$a \ge \frac{1}{4}$. (If the exponent of N, $\frac{4a-1}{4a+2}$, were negative then this right-hand
side would go to 0, $2 + 3\log N$ notwithstanding, and (23) would
become false for large N.)
Problems for Chapter III


1. Show that the number of lattice points in $x^2 + y^2 \le n^2$, $x, y \ge 0$,
is $> \frac{\pi}{4}n^2$. By the Riemann integral method show that it is, in fact
$= \frac{\pi}{4}n^2 + O(n)$.


2. If x is bounded by its own square root (i.e., by $\sqrt{x} + a$), then we
find that it has a pure bound. What if x, instead, is bounded by
$x^{2/3} + ax^{1/3} + b$? Does this insure a bound on x?


3. Suppose that a convex closed curve has its curvature bounded by 
5. Show that it must come within 2/6 of some lattice point. 


4. Produce a convex closed curve with curvature bounded by 6 which 


Oy . . é . . 
doesn’t come within we of any lattice point. 


IV 


Sequences without Arithmetic 
Progressions 


The gist of the result of Chapter IV is that a sequence of integers 
with “positive density” must contain an arithmetic progression (of at 
least three distinct terms). 

More precisely and in sharper, finitized form, this is the statement 
that, if ε > 0, then for large enough n, any subset of the nonnegative
integers below n with at least εn members must contain three terms
a, b, c where a < b < c and a + c = 2b. This is a shock to nobody.
If a set is “fat” enough, it should contain all sorts of patterns. The 
shock is that this is so hard to prove. 

At any rate we begin with a vastly more general consideration, the 
notion of an “affine property” of finite sets of integers. So let us agree 
to call a property P an affine property if it satisfies the following two 
conditions: 


1. For each fixed pair of integers α, β with α ≠ 0, the set A(n) has
P if and only if αA(n) + β has P.
2. Any subset of a set, which has P, also has P.


Thus, for example, the property $P_A$ of not containing any arithmetic
progressions is an affine property. Again the trivial property
$P_0$ of just being any set is an affine one.

Now we fix an affine property P and consider a /argest subset of 
the nonnegative integers below n, which has P. (Thus we require 
that this set has the most members possible, not just to be maximal.) 
There may be several such sets but we choose one of them and denote 
it by S(n; P). We also denote the number of elements of this set by 


f(n; P). So, for example, for the trivial property, $f(n; P_0) = n$, and
for $P_A$, $f(3; P_A) = 2$, $f(5; P_A) = 4$.
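For small n the values of f for the progression-free property can be found by exhaustive search. The following sketch is a brute-force illustration only (exponential in n); the function names `has_three_term_ap` and `f_PA` are hypothetical labels, not notation from the text.

```python
from itertools import combinations

def has_three_term_ap(s):
    """True if the set s contains a < b < c with a + c = 2b."""
    members = set(s)
    return any((a + c) % 2 == 0 and (a + c) // 2 in members
               for a, c in combinations(sorted(members), 2))

def f_PA(n):
    """Largest size of a progression-free subset of {0, 1, ..., n-1} (small n only)."""
    for size in range(n, 0, -1):
        if any(not has_three_term_ap(c) for c in combinations(range(n), size)):
            return size
    return 0

print([f_PA(n) for n in range(1, 11)])
```

The search reproduces the two values quoted above, and comparing entries of the list also gives small instances of the subadditivity f(m + n) ≤ f(m) + f(n).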

It follows easily from conditions 1 and 2 that this f(n) is subadditive,
i.e., $f(m+n) \le f(m) + f(n)$. If we recall the fact
that subadditive functions enjoy the property that $\lim_{n\to\infty}\frac{f(n)}{n}$ exists
(in fact $\lim_{n\to\infty}\frac{f(n)}{n} = \inf\frac{f(n)}{n}$), we are led to define
$C_P = \lim_{n\to\infty}\frac{f(n; P)}{n}$. This number is a measure of how permissive the
property P is. Thus $C_{P_0} = 1$, because $P_0$ is totally permissive. The
announced result about progression-free sequences amounts to the
statement that $C_{P_A} = 0$, so that $P_A$ is, in this sense, totally unpermissive.
At any rate, we always have $0 \le C_P \le 1$, and we may dub
$C_P$ the permission constant.

The remarkable result proved by Szemerédi and then later by
Furstenberg is that, except for $P_0$, $C_P$ is always 0. Their proofs are
both rather complicated, and we shall content ourselves with the case
of $P_A$, which was proved by Roth.


The Basic Approximation Lemma 


It turns out that the extremal sets S(n; P) all behave very much as
though their elements were chosen at random. For example, we note
that such a set must contain roughly the same number of evens
as odds. Indeed if $2b_1, 2b_2, \ldots, 2b_k$ were its even elements, then
$b_1, b_2, \ldots, b_k$ would be a subset of $[0, \frac{n}{2})$ and so we could conclude
that $k \le f(\frac{n}{2})$. Similarly the population of the odd elements of
S would satisfy this same inequality. Since $f(\frac{n}{2}) \sim \frac{1}{2}f(n)$, we conclude
that both the evens and the odds contain not much more than
half the whole set. Thereby the evens and the odds must be roughly
equinumerous. (Thus, two upper bounds imply the lower bounds.)
Delaying for the moment the precise statement of this “randomness,”
let us just note how it will prove useful to us with regard to
our arithmetic progression considerations. The point is simply that,
if integers were chosen truly at random with a probability C > 0,
there would automatically be a huge number of arithmetic progressions
formed. So we expect that even an approximate randomness
should produce at least one arithmetic progression.
The precise assertion is that of the following lemma. 


Lemma. $\sum_{a\in S(n;P)} z^a = C_P\sum_{k<n} z^k + o(n)$, uniformly on |z| = 1.


Remark. In terms of the great Szemerédi–Furstenberg result that
$C_P = 0$ (except for $P = P_0$), this is a total triviality. We are proving
what in truth is an empty result. Nevertheless we are not prepared
to give the lengthy and complex proofs of this general theorem, and
so we must prove the Lemma. (We do what we can.) The proof, in
fact, is really just an elaboration of the odds and evens considerations
above.


PROOF. The basic strategy is to estimate $q(z) = \sum_{a\in S} z^a - C_P\sum_{k<n} z^k$,
together with all of its partial sums $q_m(z)$, at every root of
unity of order up to N (N is a parameter to be chosen later). The
point is that, if we have a bound on a polynomial and its partial sums
at a point, then we inherit a bound on that polynomial throughout
an arc around that point. (Thereby we will obtain bounds for arcs
between the roots of unity which will fill up the whole circle.)
Specifically, we have the identity


$$\frac{p(z) - p(\zeta)\left(\frac{z}{\zeta}\right)^n}{\zeta - z} = \sum_{m<n} p_m(\zeta)\,\frac{z^m}{\zeta^{m+1}}, \tag{1}$$


for any polynomial p of degree at most n, where the $p_m$ denote the
partial sums. (This simply records the result of the “long division.”)
From (1) we easily obtain, for $|z| = |\zeta| = 1$, the bound $|p(z)| \le |\zeta - z|\sum_{m<n} |p_m(\zeta)| + |p(\zeta)|$,
and so we conclude the following:

If all the partial sums are bounded by M at ζ, the polynomial is
bounded by M(nδ + 1) throughout an arc of length 2δ centered at ζ. (2)
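The summation-by-parts identity behind this bound can be tested on a random example. The sketch below assumes the identity in the form $p(z) = p(\zeta)(z/\zeta)^n + (1 - z/\zeta)\sum_{m<n} p_m(\zeta)(z/\zeta)^m$, where $p_m$ is the partial sum of p through degree m; the coefficient choice, the points on the circle, and the function name are illustrative.

```python
import cmath
import random

def abel_check(coeffs, z, zeta):
    """Return (p(z), right-hand side of the identity, list of partial sums p_m(zeta), m < n)."""
    n = len(coeffs) - 1
    p = lambda x: sum(c * x ** k for k, c in enumerate(coeffs))
    partial, running = [], 0
    for k in range(n):                      # p_m(zeta) for m = 0, ..., n-1
        running += coeffs[k] * zeta ** k
        partial.append(running)
    w = z / zeta
    rhs = p(zeta) * w ** n + (1 - w) * sum(pm * w ** m for m, pm in enumerate(partial))
    return p(z), rhs, partial

random.seed(0)
coeffs = [random.choice([0, 1]) for _ in range(41)]   # a 0/1 polynomial of degree <= 40
zeta = cmath.exp(2j * cmath.pi * 3 / 7)               # a 7th root of unity
z = zeta * cmath.exp(0.01j)                           # a nearby point on the unit circle
lhs, rhs, partial = abel_check(coeffs, z, zeta)
p_zeta = sum(c * zeta ** k for k, c in enumerate(coeffs))
bound = abs(zeta - z) * sum(abs(pm) for pm in partial) + abs(p_zeta)
print(abs(lhs - rhs), abs(lhs), bound)
```

The first printed number should be at rounding-error level, and |p(z)| should not exceed the bound built from the partial sums at ζ.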
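So let α ≤ N be chosen, and let ω be any α-th root of unity, i.e.,
$\omega^\alpha = 1$. To estimate $q_m(\omega)$, let us write it as


$$q_m(\omega) = \sum_{\beta=1}^{\alpha}\omega^\beta\sigma_\beta - C_P\sum_{k<m}\omega^k,$$

where $\sigma_\beta$ denotes the number of elements of S which are below m and congruent to β (mod α).
Each $\sigma_\beta$ counts the size of a subset of S, which therefore has P, which is affine
to a subset of $[0, \frac{m}{\alpha})$, and so has at most $f\left(\frac{m}{\alpha}\right)$ elements (where we
write f(x) for f([x])).
Thus

$$q_m(\omega) = -\sum_{\beta=1}^{\alpha}\omega^\beta\left(f\left(\frac{m}{\alpha}\right) - \sigma_\beta\right) + \sum_{\beta=1}^{\alpha} f\left(\frac{m}{\alpha}\right)\omega^\beta - C_P\sum_{k<m}\omega^k$$

$$\ll \sum_{\beta=1}^{\alpha}\left(f\left(\frac{m}{\alpha}\right) - \sigma_\beta\right) + \sum_{\beta=1}^{\alpha}\left(f\left(\frac{m}{\alpha}\right) - C_P\frac{m}{\alpha}\right) = 2\alpha f\left(\frac{m}{\alpha}\right) - \sum_{\beta=1}^{\alpha}\sigma_\beta - C_P m. \tag{3}$$


If we next note that $\sum_{\beta=1}^{\alpha}\sigma_\beta$ is exactly the number of elements of
S which are below m and so is equal to f(n) minus the number of
elements of S which are ≥ m, we obtain


$$\sum_{\beta=1}^{\alpha}\sigma_\beta \ge f(n) - f(n-m) \ge C_P n - f(n-m). \tag{4}$$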
p=1 
Substituting (4) in (3) gives


$$q_m(\omega) \ll 2\alpha\left[f\left(\frac{m}{\alpha}\right) - C_P\frac{m}{\alpha}\right] + \left[f(n-m) - C_P(n-m)\right]. \tag{5}$$


Now we find it useful to replace the function $f(x) - C_P x$ by its
“monotone majorant” $F(x) = \max_{t\le x}(f(t) - C_P t)$ and note that this
F(x) is nondecreasing and satisfies F(x) = o(x) since $f(x) - C_P x$
satisfies the same. So (5) can be replaced by


$$q_m(\omega) \ll 2\alpha F\left(\frac{m}{\alpha}\right) + F(n-m) \le 2\alpha F\left(\frac{n}{\alpha}\right) + F(n) \tag{6}$$


(a bound independent of m).

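So choose $n_0$ so that $x \ge n_0$ implies $F(x) \le \epsilon x$, and then choose
$n_1$ so that $x \ge n_1$ implies $F(x) \le \frac{\epsilon}{n_0}x$. From now on we will pick
$n \ge n_1$ and also will fix $N = \left[\frac{n}{n_0}\right]$.

Dirichlet’s theorem¹ on approximation by rationals now tells us
that the totality of arcs surrounding these ω with length $2\cdot\frac{2\pi}{\alpha(N+1)}$
covers the whole circle. Thus using (2) for q(z), ζ = ω and
$\delta = \frac{2\pi}{\alpha(N+1)} \le \frac{2\pi n_0}{\alpha n}$ gives


$$q(z) \ll \left[2\alpha F\left(\frac{n}{\alpha}\right) + F(n)\right]\left(1 + \frac{2\pi n_0}{\alpha}\right). \tag{7}$$


We separate two cases:
Case I: $\alpha \le n_0$. Here we use $F(\frac{n}{\alpha}) \le F(n)$ and obtain
$\left[2\alpha F(\frac{n}{\alpha}) + F(n)\right]\left(1 + \frac{2\pi n_0}{\alpha}\right) \le (2\alpha + 1)\left(1 + \frac{2\pi n_0}{\alpha}\right)F(n) \le 3\alpha\left(1 + \frac{2\pi n_0}{\alpha}\right)F(n) = (3\alpha + 6\pi n_0)F(n) \le (6\pi + 3)\,n_0 F(n) \le (6\pi + 3)\,n_0\,\frac{\epsilon}{n_0}\,n < 22\epsilon n$.
Case II: $\alpha > n_0$. Here $\left[2\alpha F(\frac{n}{\alpha}) + F(n)\right]\left(1 + \frac{2\pi n_0}{\alpha}\right) \le \left[2\alpha F(\frac{n}{\alpha}) + F(n)\right](1 + 2\pi)$.
But still $\alpha \le N \le \frac{n}{n_0}$, so $\frac{n}{\alpha} \ge n_0$, so $F(\frac{n}{\alpha}) \le \epsilon\frac{n}{\alpha}$, and
the above is $\le (2\epsilon n + \epsilon n)(1 + 2\pi) = (3 + 6\pi)\epsilon n < 22\epsilon n$.

In either case Dirichlet’s theorem yields our lemma.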

¹Dirichlet’s theorem can be proved by considering the powers $1, z, z^2, \ldots, z^N$ for z
any point on the unit circle. Since these are N + 1 points on the circle, two of them
$z^i, z^j$ must be within arc length $\frac{2\pi}{N+1}$ of one another. This means $|\arg z^{i-j}| \le \frac{2\pi}{N+1}$
and calling |i − j| = α gives the result.

So let P be any affine property, and denote by A = A(n; P) the
number of arithmetic progressions from S(n; P) (where order counts
and equality is allowed). We show that


$$A(n; P) = \frac{C_P^3}{2}\,n^2 + o(n^2). \tag{8}$$


The proof is by contour integration. If we abbreviate $\sum_{a\in S} z^a = g(z)$,
then we recognize A as the constant term in $g(z)g(z)g(z^{-2})$,
and so we may write


$$A = \frac{1}{2\pi i}\int_{|z|=1} g^2(z)\,g(z^{-2})\,\frac{dz}{z}. \tag{9}$$

Now writing $G(z) = \sum_{k<n} z^k$, $g(z) = C_P G(z) + q(z)$ (where q
is “small” by the lemma). If we substitute this in (9), we obtain


$$\frac{C_P^3}{2\pi i}\int_{|z|=1} G^2(z)\,G(z^{-2})\,\frac{dz}{z}$$

plus seven other integrals. Each of these other integrals is the product
of three functions, each a G or a q, and at least one of them is a q. By
our lemma, then, we may estimate each of these seven integrals by
o(n) times an integral of the product of two functions. Both of these
functions are either a |G| or a |q|. As such each is estimable by the
Schwarz inequality, Parseval equality techniques. The final estimate
for each of these seven integrals, therefore, is $o(n)\sqrt{n\cdot n} = o(n^2)$,
and so (9) gives


$$A = \frac{C_P^3}{2\pi i}\int_{|z|=1} G^2(z)\,G(z^{-2})\,\frac{dz}{z} + o(n^2). \tag{10}$$

But reading (9) for the property $P_0$ shows that this integral is
simply $A(n; P_0)$ and it is a simple exercise to show that $A(n; P_0)$,
the number of triples below n which are in arithmetic progression,
is exactly $\lceil n^2/2\rceil$. Indeed, then (10) reduces to (8). Q.E.D.
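The count of ordered triples (a, b, c) of nonnegative integers below n with a + c = 2b (equality allowed) is easy to confirm by brute force. In the sketch below the middle term b is determined by a and c, so it suffices to count the pairs (a, c) of equal parity; the function name is illustrative.

```python
def count_progressions(n):
    """Number of ordered triples (a, b, c) in {0, ..., n-1}^3 with a + c = 2b."""
    # b = (a + c) / 2 lies in the range automatically, so count pairs with a + c even.
    return sum(1 for a in range(n) for c in range(n) if (a + c) % 2 == 0)

for n in (1, 2, 3, 10, 11):
    print(n, count_progressions(n), (n * n + 1) // 2)
```

Here `(n * n + 1) // 2` is the ceiling of $n^2/2$, and the two columns agree.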


All of our discussion thus far has been quite general and is valid
for arbitrary affine properties. We finally become specific by letting
$P = P_A$, and we easily deduce the following:


Theorem (Roth). $C_{P_A} = 0$.

PROOF. By the definition of $P_A$, the only arithmetic progressions
in $S(n; P_A)$ are the trivial ones, three equal terms, which number is
at most n. Thus $A(n; P_A) \le n$, and so, by (8), $\frac{C_{P_A}^3}{2}n^2 + o(n^2) \le n$.
Therefore $C_{P_A} = 0$.
Problems for Chapter IV


1. Attach a positive rational to each integer from 1 to 12 so that all
A.P.’s with common difference d up to 6 obtain their “correct”
measure $\frac{1}{d}$.


2. Prove that, if we ask for a generalization of this, then we can only
force the correct measure $\frac{1}{d}$ for all A.P.’s of common difference
d, by attaching weights onto 1, 2, ..., n, if $d = O(\sqrt{n})$.


3. If we insist only on approximation, however, show that we can
always attach weights onto 1, 2, ..., n such that the “measure”
given to every A.P. with common difference ≤ m is within $e^{-n/m}$
of $\frac{1}{d}$.


V 
The Waring Problem 


In a famous letter to Euler, Waring wrote his great conjecture about 
sums of powers. Lagrange had already proved his magnificent the- 
orem that every positive integer was the sum of four squares, and 
Waring guessed that this was not just a property of squares, but that, 
in fact, the sum of a fixed number of cubes, fourth powers, fifth pow- 
ers, etc., also worked. He guessed that every positive integer was 
the sum of 9 cubes, 19 fourth powers, 37 fifth powers, and so forth, 
and although no serious guess was made as to how the sequence 4 
(squares), 9, 19, 37, ... went on, he simply stated that it did! That 
is what we propose to do in this chapter, just to prove the existence 
of the requisite number of the cubes, fourth powers, etc. We do not 
attempt to find the structure of the 4, 9, 19, ..., but just to prove its 
existence. 

So let us fix k and view the kth powers. Our aim, by Schnirel- 
mann’s lemmas below, need be only to produce a g = g(k) and an 
a = a(k) > 0 such that the sum of g(k) kth powers represents at 
least the fraction a(k) of all of the integers. 

One of the wonderful things about this approach is that it requires 
only upper bounds, despite the fact that Waring’s conjecture seems 
to require lower bounds, something seemingly totally impossible 
for contour integrals to produce. But the adequate upper bounds are 
obtained by the so called Weyl sums given below. 

So first we turn to our three basic lemmas which will eventually 
yield our proof. These are A, the theorem of Dirichlet, B, that of 
Schnirelmann, and finally C, the evaluation of the Weyl sums. 
A. Theorem (Dirichlet). Given a real x and a positive integer M,
there exists an integer a and a positive integer b ≤ M such that


$$\left|x - \frac{a}{b}\right| \le \frac{1}{(M+1)\,b}.$$

PROOF. Consider the numbers 0, x, 2x, 3x, ..., Mx all reduced
(mod 1). Clearly, two of these must be within $\frac{1}{M+1}$ of each other.
If these two differ by bx, then 1 ≤ b ≤ M and bx (mod 1) is, in
magnitude, $\le \frac{1}{M+1}$. Next pick an integer a that makes bx − a equal
to bx (mod 1). So $|bx - a| \le \frac{1}{M+1}$ which means $\left|x - \frac{a}{b}\right| \le \frac{1}{(M+1)b}$
as asserted.


We also point out that this is a best possible result as the choice
$x = \frac{1}{M+1}$ shows for every M. (Again, we may assume that (a, b) =
1 for, if they have a common divisor, this would make the inequality
|b| ≤ M even truer).
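Dirichlet's theorem can be illustrated by a direct search over denominators. The sketch below takes x as an exact `Fraction` (an assumption made to avoid floating-point edge cases) and returns the first pair (a, b) with 1 ≤ b ≤ M and |bx − a| ≤ 1/(M + 1); the function name is illustrative.

```python
from fractions import Fraction

def dirichlet_approximation(x, M):
    """Return (a, b) with 1 <= b <= M and |x - a/b| <= 1/((M+1) b), for a Fraction x."""
    for b in range(1, M + 1):
        a = round(b * x)                              # nearest integer to b*x
        if abs(b * x - a) <= Fraction(1, M + 1):
            return a, b
    raise AssertionError("no approximation found; this would contradict the theorem")

x = Fraction(14142136, 10**7)                         # a rational stand-in for sqrt(2)
print(dirichlet_approximation(x, 10))
```

With M = 10 the search stops at b = 5, since 5x is within 1/11 of the integer 7.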


B. Schnirelmann's Theorem. If $S$ is a set of integers with positive Schnirelmann density and $0 \in S$, then every non-negative integer is the sum of at most $k$ members of $S$ for some $k \ge 1$. 

Lemma 1. Let $S$ have density $a$ and $0 \in S$. Then $S \oplus S$ has density at least $2a - a^2$. 

PROOF. All the gaps in the set $S$ are covered in part by the translation of $S$ by the term of $S$ just before this gap. Hence, at least the fraction $a$ of this gap gets covered. So from this covering we have density $a$ from $S$ itself and $a$ times the gaps. Altogether, then, we indeed have $a + a(1 - a) = 2a - a^2$, as claimed. 

Lemma 2. If $S$ has density $a > \frac{1}{2}$, then $S \oplus S$ contains all the positive integers. 

PROOF. Fix an integer $n$ which is arbitrary, let $A$ be the subset of $S$ which lies $\le n$, and let $B$ be the set of all $n$ minus elements of $S$. Since $A$ contains more than $n/2$ elements and $B$ contains at least $n/2$ elements, the Pigeonhole principle guarantees that they overlap. So suppose they overlap at $k$. Since $k \in A$, we get $k \in S$, and since $k \in B$, we get $n - k \in S$. These are the two elements of $S$ which sum to $n$. 

Repeating Lemma 1 $j$ times, then, leads to a summing of $2^j$ copies of $S$ and a density of $1 - (1 - a)^{2^j}$ or more. Since this latter quantity, for large enough $j$, will become bigger than $\frac{1}{2}$, Lemma 2 tells us that $2^{j+1}$ copies of $S$ give us all the integers, just as Schnirelmann's theorem claims. Q.E.D. 
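Lemmas 1 and 2 can be illustrated on finite initial segments. The sketch below is only an illustration under stated assumptions: `density_up_to` uses the minimum of $|S \cap [1, m]|/m$ over $1 \le m \le n$ as a finite stand-in for the Schnirelmann density, and both function names are hypothetical.

```python
from fractions import Fraction

def density_up_to(S, n):
    """min over 1 <= m <= n of |S ∩ [1, m]| / m (finite stand-in for the density)."""
    members = set(S)
    count, best = 0, Fraction(1)
    for m in range(1, n + 1):
        count += m in members            # True counts as 1
        best = min(best, Fraction(count, m))
    return best

def sumset_up_to(S, n):
    """All sums s + t <= n with s and t taken from S."""
    members = sorted(v for v in set(S) if v <= n)
    return {s + t for s in members for t in members if s + t <= n}
```

For example, with $S = \{0\} \cup \{m : m \equiv 1 \pmod 3\}$ the finite density is $1/3$, and that of $S \oplus S$ is $2/3$, comfortably above $2a - a^2 = 5/9$.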


C. Evaluation of Weyl Sums. Let $b \in \mathbb{Z}$, $b \ne 0$ and $k < N$, $P(n)$ be a polynomial of degree $k$ with real coefficients and leading coefficient integral and prime to $b$, and let $I$ be an interval of length $\le N$. Then

\[ \sum_{n \in I} e\!\left(\frac{P(n)}{b}\right) \ll N^{1+o(1)}\, b^{-2^{1-k}}, \]

where the bound depends on $k$. 

Here, as usual, we denote $e(x) = e^{2\pi i x}$. 


We proceed by induction on $k$, which represents the degree of $P(n)$. It is clearly true for $k = 1$, and generally we may write

\[ S = \sum_{n \in I} e\!\left(\frac{P(n)}{b}\right) \]

and may assume w.l.o.g. that $I = \{1, 2, 3, \ldots, N\}$. Thereby

\[ |S|^2 = \sum_{j=-N+1}^{N-1} \; \sum_{\substack{n \in \{1, 2, \ldots, N\} \\ n \in \{j+1, j+2, \ldots, j+N\}}} e\!\left(\frac{P(n) - P(n-j)}{b}\right). \]

This inner sum involves a polynomial of degree $(k - 1)$ but has a leading coefficient which varies with $j$. If we count those $j$ which produce a denominator of $d$, which of course must divide $b$, then we observe that this must appear roughly $d$ times in an interval of length $b$. So this number of $j$ in the full interval of length $2N + 1$ is roughly $\frac{(2N+1)}{b}\, d$. 

The full estimate, then, by the inductive hypothesis is

\[ |S|^2 \ll \sum_{d \mid b} \frac{N d}{b}\, N^{1+o(1)}\, d^{-2^{2-k}} \ll \frac{N^{2+o(1)}}{b} \sum_{d \mid b} d^{\,1 - 2^{2-k}} \ll N^{2+o(1)}\, b^{-2^{2-k}}. \]

So we obtain

\[ S \ll N^{1+o(1)}\, b^{-2^{1-k}} \]

and the induction is complete. 
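To get a feeling for the cancellation in such sums, one can evaluate them directly for the monomial $P(n) = a n^k$. This is a numerical sketch only: the function name `weyl_sum` is hypothetical, and the comparison with $\sqrt{p}$ in the usage note relies on the classical evaluation of the quadratic Gauss sum, a fact taken from outside this text.

```python
import cmath
import math

def weyl_sum(a, b, k, N):
    """Sum of e(a * n^k / b) for n = 1..N, where e(x) = exp(2*pi*i*x)."""
    total = 0j
    for n in range(1, N + 1):
        r = (a * pow(n, k, b)) % b       # reduce mod b first to keep floats accurate
        total += cmath.exp(2j * math.pi * r / b)
    return total
```

For an odd prime $p$ and $(a, p) = 1$, `abs(weyl_sum(a, p, 2, p))` comes out as $\sqrt{p}$ up to rounding, which is the size $N\, b^{-2^{1-k}}$ with $k = 2$ and $N = b = p$.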


Now we continue as follows: 


Lemma 3. Let $k > 1$ be a fixed integer. There exists a $C_1$ such that, for any positive integers $N, a, b$ with $(a, b) = 1$,

\[ \left| \sum_{n=1}^{N} e\!\left(\frac{a}{b}\, n^k\right) \right| \le C_1\, N^{1+o(1)}\, b^{-2^{1-k}}. \]


Our endpoint will be the following: 


Theorem. If for each positive integer $s$, we write

\[ r_s(n) = \sum_{\substack{n_1^k + \cdots + n_s^k = n \\ n_i \ge 0}} 1, \]

then there exists $g$ and $C$ such that $r_g(n) \le C n^{g/k - 1}$ for all $n > 0$. 


The previously cited notions of Schnirelmann allow deducing the full Waring result from this theorem: 

There exists a $G$ for which $r_G(n) > 0$ for all $n > 0$. 
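The counting function $r_s(n)$ is easy to tabulate for small cases. The sketch below is an illustration only (the name `representation_counts` is hypothetical); it counts ordered $s$-tuples of non-negative integers by repeated convolution with the list of $k$th powers.

```python
def representation_counts(k, s, limit):
    """r[n] = number of ordered s-tuples of integers n_i >= 0 with n_1^k + ... + n_s^k = n."""
    powers = []
    m = 0
    while m ** k <= limit:
        powers.append(m ** k)
        m += 1
    r = [0] * (limit + 1)
    r[0] = 1                              # the empty sum represents 0 exactly once
    for _ in range(s):
        nxt = [0] * (limit + 1)
        for n, ways in enumerate(r):
            if ways:
                for p in powers:          # powers is increasing, so we may stop early
                    if n + p > limit:
                        break
                    nxt[n + p] += ways
        r = nxt
    return r
```

For instance, `representation_counts(3, 8, 23)[23]` is 0 while `representation_counts(3, 9, 23)[23]` is positive: 23 needs nine cubes.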
To prove our theorem, since

\[ r_s(n) = \int_0^1 \Bigl( \sum_{m \le n^{1/k}} e(x m^k) \Bigr)^{s} e(-nx)\, dx, \]

it suffices to prove that there exists $g$ and $C$ for which

\[ \int_0^1 \Bigl| \sum_{n=1}^{N} e(x n^k) \Bigr|^{g} dx \le C N^{g-k} \quad \text{for all } N > 0. \tag{1} \]

First some parenthetical remarks about this inequality. Suppose it is known to hold for some $C_0$ and $g_0$. Then, since $\bigl|\sum_{n=1}^{N} e(x n^k)\bigr| \le N$, it persists for $C_0$ and any $g \ge g_0$. Thus (1) is a property of large $g$'s, in other words, it is purely a “magnitude property.” Again, (1) is a best possible inequality in that, for each $g$, there exists a $c > 0$ such that

\[ \int_0^1 \Bigl| \sum_{n=1}^{N} e(x n^k) \Bigr|^{g} dx > c N^{g-k} \quad \text{for all } N > 0. \tag{2} \]

To see this, note that $\sum_{n=1}^{N} e(x n^k)$ has a derivative bounded by $2\pi N^{k+1}$. Hence, in the interval $\bigl(0, \frac{1}{4\pi N^k}\bigr)$,

\[ \Bigl| \sum_{n=1}^{N} e(x n^k) \Bigr| \ge N - \frac{1}{4\pi N^k}\, 2\pi N^{k+1} = \frac{N}{2}, \]

and so (2) follows with $c = \frac{1}{4\pi\, 2^{g}}$. 
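The integral formula for $r_s(n)$ can be checked numerically. The following sketch rests on one assumption that is not in the text: because the integrand is a trigonometric polynomial with frequencies between $-n$ and $(s-1)n$, an equally spaced Riemann sum with $Q = sn + 1$ nodes reproduces the integral exactly (up to floating-point rounding). The name `r_via_integral` is hypothetical, and the inner sum is taken over $0 \le m \le n^{1/k}$ to match the convention $n_i \ge 0$.

```python
import cmath
import math

def r_via_integral(k, s, n):
    """Evaluate the integral formula for r_s(n) by a Q-point Riemann sum (n >= 1)."""
    top = 0
    while (top + 1) ** k <= n:
        top += 1                                    # top = floor(n^(1/k))
    Q = s * n + 1                                   # more nodes than the largest frequency
    total = 0j
    for q in range(Q):
        inner = sum(cmath.exp(2j * math.pi * ((q * m ** k) % Q) / Q)
                    for m in range(top + 1))        # sum of e(x m^k) at x = q/Q
        total += inner ** s * cmath.exp(-2j * math.pi * ((q * n) % Q) / Q)
    return round((total / Q).real)
```

For squares, `r_via_integral(2, 2, 25)` returns 4, corresponding to $(0,5)$, $(5,0)$, $(3,4)$, $(4,3)$.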


The remainder of our paper, then, will be devoted to the derivation of (1) from Lemma 3. Henceforth $k$ is fixed. Denote by $I_{a,b,N}$ the $x$-interval $\left|x - \frac{a}{b}\right| \le \frac{1}{b N^{k-1/2}}$, and call $J = N^k \left|x - \frac{a}{b}\right|$, $j = [J]$, where $a, b, N, j$ are integers satisfying $N > 0$, $b > 0$, $0 \le a < b$, $(a, b) = 1$, $b \le N^{k-1/2}$. 

By Dirichlet's theorem, these intervals cover $(0, 1)$. Our main tool is the following lemma: 


Lemma 4. There exists $\epsilon > 0$ and $C_2$ such that, throughout any interval $I_{a,b,N}$,

\[ \Bigl| \sum_{n=1}^{N} e(x n^k) \Bigr| \le \frac{C_2 N}{(b + j)^{\epsilon}}. \]

PROOF. This is almost trivial if $b > N^{2/3}$, for, since the derivative of $\sum_{n=1}^{N} e(x n^k)$ is bounded by $2\pi N^{k+1}$,

\[ \Bigl| \sum_{n=1}^{N} e(x n^k) \Bigr| \le \Bigl| \sum_{n=1}^{N} e\!\left(\frac{a}{b}\, n^k\right) \Bigr| + \Bigl| x - \frac{a}{b} \Bigr|\, 2\pi N^{k+1} \le \frac{C_1 N^{1+o(1)}}{b^{2^{1-k}}} + \frac{2\pi N^{3/2}}{b} \le \frac{C_1 N^{1+o(1)}}{b^{2^{1-k}}} + \frac{2\pi N}{b^{1/4}}, \]

which gives the result, since $j = 0$ automatically. Assume therefore that $b \le N^{2/3}$, and note the following two simple facts (A) and (B). For details see [K. Knopp, Theory and Application of Infinite Series, Blackie & Sons, Glasgow, 1946] and [G. Pólya und G. Szegő, Aufgaben und Lehrsätze aus der Analysis, Dover Publications, New York 1945, Vol. 1, Part II, p. 37]. 


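(A) If $M$ is the maximum of the moduli of the partial sums $\sum_{n=1}^{m} a_n$, $V$ the total variation of $f(t)$ in $0 \le t \le N$, and $M'$ the maximum of the modulus of $f(t)$ in $0 \le t \le N$, then

\[ \Bigl| \sum_{n=1}^{N} a_n f(n) \Bigr| \le M(V + M'). \]

(B) If $V$ is the total variation of $f(t)$ in $0 \le t \le N$, then

\[ \Bigl| \sum_{n=1}^{N} f(n) - \int_0^N f(t)\, dt \Bigr| \le V. \]

Now write $\alpha = \frac{1}{b} \sum_{n=1}^{b} e\!\left(\frac{a}{b}\, n^k\right)$ and

\[ \sum_{n=1}^{N} e(x n^k) = S_1 + \alpha S_2, \tag{3} \]

where

\[ S_1 = \sum_{n=1}^{N} \Bigl[ e\!\left(\frac{a}{b}\, n^k\right) - \alpha \Bigr] e\!\left(\Bigl(x - \frac{a}{b}\Bigr) n^k\right), \qquad S_2 = \sum_{n=1}^{N} e\!\left(\Bigl(x - \frac{a}{b}\Bigr) n^k\right). \]

We apply (A) to $S_1$. To do so, we note that

\[ \Bigl| \sum_{n=1}^{m} \Bigl[ e\!\left(\frac{a}{b}\, n^k\right) - \alpha \Bigr] \Bigr| = \Bigl| \sum_{n = b[m/b]+1}^{m} \Bigl[ e\!\left(\frac{a}{b}\, n^k\right) - \alpha \Bigr] \Bigr| \le (1 + |\alpha|)\, b \le 2b. \]

Also, the total variation of $e\bigl[(x - \frac{a}{b}) t^k\bigr]$ is equal to $2\pi \left|x - \frac{a}{b}\right| N^k \le 2\pi\sqrt{N}/b$, whereas $M' = 1$. The result is

\[ |S_1| \le 4\pi\sqrt{N} + 2b \le 5\pi N^{2/3}. \tag{4} \]

Next we apply (B) to $S_2$ and obtain

\[ |S_2| \le \Bigl| \int_0^N e\!\left(\Bigl(x - \frac{a}{b}\Bigr) t^k\right) dt \Bigr| + \frac{2\pi\sqrt{N}}{b}. \tag{5} \]

Since $\int_0^\infty e(u^k)\, du$ converges we get

\[ \Bigl| \int_0^N e\!\left(\Bigl(x - \frac{a}{b}\Bigr) t^k\right) dt \Bigr| = \frac{N}{J^{1/k}} \Bigl| \int_0^{J^{1/k}} e(u^k)\, du \Bigr| \le \frac{C_3 N}{(1 + j)^{1/k}}. \]

Combining this with (5) gives

\[ |S_2| \le \frac{C_4 N}{(1 + j)^{1/k}}. \tag{6} \]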


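Now if we apply Lemma 3 to the case $N = b$, we obtain $|\alpha| \le C_1\, b^{o(1) - 2^{1-k}}$, and by (3) the addition of (4) and (6) gives

\[ \Bigl| \sum_{n=1}^{N} e(x n^k) \Bigr| \le \frac{C_5 N}{b^{2^{-k}} (1 + j)^{1/k}} + 5\pi N^{2/3} \le \frac{C_5 N}{(b + j)^{\epsilon}} + \frac{C_6 N}{(b + j)^{\epsilon}}. \]

Since $j \le \sqrt{N}$ and $b \le N^{2/3}$, the choice $C_2 = C_5 + C_6 + C_1 + 2\pi$, $\epsilon = \min\bigl(\frac{1}{4}, \frac{1}{k}, 2^{-k}\bigr)$ completes the proof. 

Proof of (1). Choose $g \ge 4/\epsilon$, $\epsilon$ given as above. By Lemma 4, since the part of each $I_{a,b,N}$ on which $j$ takes a given value has length at most $2N^{-k}$, the integral over that part satisfies

\[ \int \Bigl| \sum_{n=1}^{N} e(x n^k) \Bigr|^{g} dx \le \frac{C_2^{\,g} N^{g}}{(b + j)^{g\epsilon}} \cdot \frac{2}{N^{k}}. \]

Summing over all $a, b, j$ gives the estimate

\[ \int_0^1 \Bigl| \sum_{n=1}^{N} e(x n^k) \Bigr|^{g} dx \le 2\, C_2^{\,g} N^{g-k} \sum_{b=1}^{\infty} \sum_{j=0}^{\infty} \frac{b}{(b + j)^{4}} \le C N^{g-k}, \]

since $\sum_{b=1}^{\infty} \sum_{j=0}^{\infty} \frac{b}{(b+j)^4} < \infty$, and the proof is complete. 

Problems for Chapter V 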


1. If we permit polynomials with arbitrary complex coefficients and 
ask the “Waring” problem for polynomials, then show that x is 
not the sum of 2 cubes, but it is the sum of 3 cubes. 


2. Show that every polynomial is the sum of 3 cubes. 


3. Show, in general, that the polynomial x is “pivotal,” that is if x is 
the sum of g nth powers, then every polynomial is the sum of g 
nth powers. 


4. Show that if $\max(a, b) > 2c$, where $c$ is the degree of $R(x)$, then $P^a + Q^b = R$ is unsolvable. 


5. Show that the constant polynomial 1 can be written as the sum of $\sqrt{4n + 1}$ $n$th powers of nonconstant polynomials. 


VI 


A “Natural” Proof of the 
Nonvanishing of L-Series 


Rather than the usual adjectives of “elementary” (meaning not in- 
volving complex variables) or “simple” (meaning not having too 
many steps) which refer to proofs, we introduce a new one, “natural.” 
This term, which is just as undefinable as the others, is introduced to 
mean not having any ad hoc constructions or brilliancies. A “natural” 
proof, then, is one which proves itself, one available to the “common 
mathematician in the streets.” 

A perfect example of such a proof and one central to our whole 
construction is the theorem of Pringsheim and Landau. Here the cru- 
cial observation is that a series of positive terms (convergent or not) 
can be rearranged at will. Addition remains a commutative operation 
when the terms are positive. This is a sum of a set of quantities rather 
than the sum of a sequence of them. 

The precise statement of the Pringsheim—Landau theorem is that, 
for a Dirichlet series with nonnegative coefficients, the real boundary 
point of its convergence region must be a singularity. 

Indeed this statement proves itself through the observation that $n^{a-z} = \sum_k \frac{(a - z)^k (\log n)^k}{k!}$ is a power series in $(a - z)$ with nonnegative coefficients. Thus the (unique) power series for $\sum a_n n^{-z} = \sum a_n n^{-a} \cdot n^{a-z}$ has nonnegative coefficients in powers of $(a - z)$. So let $b$ be the real boundary point of the convergence region of $\sum a_n n^{-z}$, and suppose that $b$ is a regular point and that $b < a$. Thus the power series in $(a - z)$ continues to converge a bit to the left of $b$ and, by rearranging terms, the Dirichlet series converges there also, contradicting the meaning of $b$. A “natural” proof of a “natural” theorem follows, one with a very nice corollary which we record for future use. 

(1) If a Dirichlet series with nonnegative coefficients represents a 
function which is (can be continued to be) entire, then it is everywhere 
convergent. 

Our ultimate aim is to prove that the L-series have no zeros on the line $\Re z = 1$. This is the nonvanishing of the L-series that we referred to in the chapter title. So let us begin with the simplest of all L-series, the $\zeta$-function, $\zeta(z) = \sum \frac{1}{n^z}$. Our proof, in fact, was noticed by Narasimhan and is as follows: Assume, par contraire, that $\zeta(z)$ had a zero at $1 + ia$, $a$ real. Then (sic!) the function $\zeta(z)\zeta(z + ia)$ would be entire. (See the appendix.) 

The only trouble points could be at $z = 1$ or at $z = 1 - ia$ where one of the factors has a pole, but these are then cancelled by the other factor, which, by our assumption, has a zero. 

A bizarre conclusion, perhaps, that the Dirichlet series $\zeta(z)\zeta(z + ia)$ is entire. But how to get a contradiction? Surely there is no hint from its coefficients, they aren't even real. A natural step then would be to make them real by multiplying by the conjugate coefficient function, $\zeta(z)\zeta(z - ia)$, which of course is also entire. We are led, then, to form $\zeta^2(z)\zeta(z + ia)\zeta(z - ia)$. 

This function is entire and has real coefficients, but are they positive? (We want them to be so that we can use (1).) Since these are complicated coefficients dependent on sums of complex powers of divisors, we pass to the logarithm, $2 \log \zeta(z) + \log \zeta(z + ia) + \log \zeta(z - ia)$, which, by Euler's factorization of the $\zeta$-function, has simple coefficients. A dangerous route, passing to the logarithm, because this surely destroys our everywhere analyticity. Nevertheless let us brazen forth (faint heart fair maiden never won). 

By Euler's factorization,

\[ 2 \log \zeta(z) + \log \zeta(z + ia) + \log \zeta(z - ia) = \sum_p \Bigl( 2 \log \frac{1}{1 - p^{-z}} + \log \frac{1}{1 - p^{-z-ia}} + \log \frac{1}{1 - p^{-z+ia}} \Bigr) = \sum_p \sum_\nu \frac{1}{\nu p^{\nu z}} \bigl( 2 + p^{-i\nu a} + p^{i\nu a} \bigr), \]

and indeed these coefficients are nonnegative! The dangerous route is now reversed by exponentiating. We return to our entire function while preserving the nonnegativity of the coefficients. All in all, then, 

(2) $\zeta^2(z)\zeta(z + ia)\zeta(z - ia)$ is an entire Dirichlet series with nonnegative coefficients. Combining this with (1) implies the unbelievable fact that 

(3) the Dirichlet series for $\zeta^2(z)\zeta(z + ia)\zeta(z - ia)$ is everywhere convergent. 
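The nonnegativity of the coefficients can be observed directly by multiplying out the four Dirichlet series. The sketch below is only a numerical illustration: the names `dirichlet_convolve` and `coefficients` are hypothetical, and the value of $a$ is arbitrary, since nonnegativity holds for every real $a$ whether or not $1 + ia$ is a zero.

```python
import cmath
import math

def dirichlet_convolve(f, g, limit):
    """h[n] = sum over d*e = n of f[d]*g[e], for 1 <= n <= limit (index 0 unused)."""
    h = [0j] * (limit + 1)
    for d in range(1, limit + 1):
        if f[d] == 0:
            continue
        for e in range(1, limit // d + 1):
            h[d * e] += f[d] * g[e]
    return h

def coefficients(a, limit):
    """Dirichlet coefficients of zeta(z)^2 * zeta(z + ia) * zeta(z - ia) up to `limit`."""
    one = [0j] + [1 + 0j] * limit                                      # zeta(z): all ones
    plus = [0j] + [cmath.exp(-1j * a * math.log(n))                    # zeta(z + ia): n^{-ia}
                   for n in range(1, limit + 1)]
    minus = [0j] + [cmath.exp(1j * a * math.log(n))                    # zeta(z - ia): n^{ia}
                    for n in range(1, limit + 1)]
    c = dirichlet_convolve(one, one, limit)
    c = dirichlet_convolve(c, plus, limit)
    return dirichlet_convolve(c, minus, limit)
```

At a prime $p$ the coefficient is $2 + p^{-ia} + p^{ia} = 2 + 2\cos(a \log p)$, in agreement with the $\nu = 1$ term of the logarithm.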

The falsity of (3) can be established in many ways, especially if we recall that the coefficients are all nonnegative. For example, the subseries corresponding to $n$ = power of 2 is exactly equal to $\frac{1}{(1 - 2^{-z})^2} \cdot \frac{1}{1 - 2^{-z-ia}} \cdot \frac{1}{1 - 2^{-z+ia}}$, which exceeds $\frac{1}{(1 - 2^{-z})^2} \cdot \frac{1}{4}$ along the nonnegative (real) axis and thereby guarantees divergence at $z = 0$. Q.E.D. 

And so we have the promised natural proof of the nonvanishing of 
the ¢-function which can then lead to the natural proof of the Prime 
Number Theorem. We must turn to the general L-series which holds 
the germ of the proof of the Prime Progression Theorem. Dirichlet 
pointed out that the natural way to treat these progressions is not 
one progression at a time but all of the pertinent progressions of a 
given modulus simultaneously, for this leads to the underlying group 
and hence to its dual group, the group of characters. Let us look, for 
example, at the modulus 10. The pertinent progressions are 10k + 1, 
10k + 3, 10k + 7,10k + 9, so that the group is the multiplicative 
group of 1,3,7,9 (mod 10). The characters are 


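$\chi_1$: $\chi_1(1) = 1$, $\chi_1(3) = 1$, $\chi_1(7) = 1$, $\chi_1(9) = 1$, 
$\chi_3$: $\chi_3(1) = 1$, $\chi_3(3) = i$, $\chi_3(7) = -i$, $\chi_3(9) = -1$, 
$\chi_7$: $\chi_7(1) = 1$, $\chi_7(3) = -i$, $\chi_7(7) = i$, $\chi_7(9) = -1$, 
$\chi_9$: $\chi_9(1) = 1$, $\chi_9(3) = -1$, $\chi_9(7) = -1$, $\chi_9(9) = 1$, 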


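and so the L-series are

\[ L_1(z) = \prod_{p \equiv 1} \frac{1}{1 - p^{-z}} \prod_{p \equiv 3} \frac{1}{1 - p^{-z}} \prod_{p \equiv 7} \frac{1}{1 - p^{-z}} \prod_{p \equiv 9} \frac{1}{1 - p^{-z}}, \]

\[ L_3(z) = \prod_{p \equiv 1} \frac{1}{1 - p^{-z}} \prod_{p \equiv 3} \frac{1}{1 - i p^{-z}} \prod_{p \equiv 7} \frac{1}{1 + i p^{-z}} \prod_{p \equiv 9} \frac{1}{1 + p^{-z}}, \]

\[ L_7(z) = \prod_{p \equiv 1} \frac{1}{1 - p^{-z}} \prod_{p \equiv 3} \frac{1}{1 + i p^{-z}} \prod_{p \equiv 7} \frac{1}{1 - i p^{-z}} \prod_{p \equiv 9} \frac{1}{1 + p^{-z}}, \]

and

\[ L_9(z) = \prod_{p \equiv 1} \frac{1}{1 - p^{-z}} \prod_{p \equiv 3} \frac{1}{1 + p^{-z}} \prod_{p \equiv 7} \frac{1}{1 + p^{-z}} \prod_{p \equiv 9} \frac{1}{1 - p^{-z}}. \]

(Here $\Re z > 1$ to ensure convergence and the subscripting of the characters is used to reflect the isomorphism of the dual group and the original group.) 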

The generating functions for the primes in the arithmetic progressions ((mod 10) in this case) are then linear combinations of the logarithms of these L-series. And so indeed the crux is the nonvanishing of these L-series. 

What could be more natural or more in the spirit of Dirichlet, but 
to prove these separate nonvanishings altogether? So we are led to 
take the product of all the L-series! (Landau uses the same device to 
prove nonvanishing of the L-series at point 1.) 

The result is the Dirichlet series

\[ Z(z) = \prod_{p \equiv 1} \frac{1}{(1 - p^{-z})^4} \prod_{p \equiv 3} \frac{1}{1 - p^{-4z}} \prod_{p \equiv 7} \frac{1}{1 - p^{-4z}} \prod_{p \equiv 9} \frac{1}{(1 - p^{-2z})^2}, \]

and the problem reduces to showing that $Z(z)$ is zero-free on $\Re z = 1$. 

Of course, this is equivalent to showing that $\prod_{p \equiv 1} \frac{1}{1 - p^{-z}}$ is zero-free on $\Re z = 1$, which seems, at first glance, to be a more attractive form of the problem. This is misleading, however, and we are better off with $Z(z)$, which is the product of L-series and is an entire function except possibly for a simple pole at $z = 1$. (See the appendix.) 

Guided by the special cases let us turn to the general one. So let $A$ be a positive integer, and denote by $G_A$ the multiplicative group of residue classes (mod $A$) which are prime to $A$. Set $h = \varphi(A)$, and denote the group elements by $1 = n_1, n_2, \ldots, n_h$. Denote the dual group of $G_A$ by $\hat{G}_A$ and its elements by $\chi_{n_1}, \chi_{n_2}, \ldots, \chi_{n_h}$, arranged so that $n_i \leftrightarrow \chi_{n_i}$ is an isomorphism of $G$ and $\hat{G}$. Next, for $\Re z > 1$, write $L_{n_i}(z) = \prod_p \frac{1}{1 - \chi_{n_i}(p)\, p^{-z}}$, and finally set $Z(z) = \prod_i L_{n_i}(z)$. As in the case $A = 10$, elementary algebra leads to $Z(z) = \prod_i \prod_{p \equiv n_i} \frac{1}{(1 - p^{-h_i z})^{h/h_i}}$, where $h_i$ is the order of the group element $n_i$. 
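The "elementary algebra" can be checked for the modulus 10. The sketch below is an illustration under stated assumptions: it builds the four characters from the generator 3 of the group $\{1, 3, 7, 9\}$ by setting $\chi_j(3) = i^j$ (a labelling chosen here for convenience, not necessarily the book's subscripting), and all function names are hypothetical. It then compares the product over all characters of $1 - \chi(n)x$ with $(1 - x^{h_i})^{h/h_i}$.

```python
GROUP = [1, 3, 7, 9]                       # units mod 10; 3 is a generator
EXPONENT = {1: 0, 3: 1, 9: 2, 7: 3}        # n = 3**EXPONENT[n] (mod 10)

def character(j):
    """The j-th character: chi_j(3) = i**j, extended multiplicatively."""
    return {n: 1j ** (j * EXPONENT[n]) for n in GROUP}

def euler_factor_product(residue, x):
    """Product over all four characters of (1 - chi(residue) * x)."""
    prod = 1 + 0j
    for j in range(4):
        prod *= 1 - character(j)[residue] * x
    return prod

def order(n):
    """Multiplicative order of n modulo 10."""
    k, t = 1, n % 10
    while t != 1:
        t = (t * n) % 10
        k += 1
    return k
```

With $x = p^{-z}$ this is the Euler factor of $1/Z(z)$ at a prime $p$ in the residue class $n$: for example the class of 3 gives $1 - x^4$ and the class of 9 gives $(1 - x^2)^2$.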

As before, $Z(z)$ is entire except possibly for a simple pole at $z = 1$, and we seek a proof that $Z(1 + ia) \ne 0$ for real $a$. So again we assume $Z(1 + ia) = 0$, form $Z^2(z)Z(z + ia)Z(z - ia)$, and conclude that it is entire. We note that its logarithm, and hence it itself, has nonnegative coefficients so that (1) is applicable. 

So, with dazzling speed, we see that a zero of any L-series would lead to the everywhere convergence of the Dirichlet series (with nonnegative coefficients) $Z^2(z)Z(z + ia)Z(z - ia)$. 

The end game (final contradiction) is also as before, although 2 may not be among the primes in the resultant product, and we may have to take some other prime $p$. Nonetheless again we see that the subseries of powers of $p$ diverges at $z = 0$, which gives us our QED. 


Appendix. A proof that the L-series are everywhere analytic functions with the exception of the principal L-series, $L_1$, at the single point $z = 1$, which is a simple pole. 

Lemma. For any $\theta$ in $[0, 1)$, define $f(z) = \sum_{n=1}^{\infty} \frac{1}{(n - \theta)^z} - \frac{1}{z - 1}$ for $\Re z > 1$. Then $f(z)$ is continuable to an entire function. 

PROOF. Since, for $\Re z > 1$, $\int_0^\infty e^{-nt} e^{\theta t}\, t^{z-1}\, dt = \frac{1}{(n - \theta)^z} \int_0^\infty e^{-t}\, t^{z-1}\, dt = \frac{\Gamma(z)}{(n - \theta)^z}$, by summing, we get

\[ \sum_{n=1}^{\infty} \frac{1}{(n - \theta)^z} = \frac{1}{\Gamma(z)} \int_0^\infty \frac{e^{\theta t}}{e^t - 1}\, t^{z-1}\, dt \]

or

\[ f(z) = \frac{1}{\Gamma(z)} \int_0^\infty \Bigl( \frac{e^{\theta t}}{e^t - 1} - \frac{e^{-t}}{t} \Bigr) t^{z-1}\, dt. \]

Since $\frac{e^{\theta t}}{e^t - 1} - \frac{e^{-t}}{t}$ is analytic and has integrable derivatives on $[0, \infty)$, we may integrate by parts repeatedly and thereby get

\[ f(z) = \frac{(-1)^k}{\Gamma(z + k)} \int_0^\infty \frac{d^k}{dt^k} \Bigl( \frac{e^{\theta t}}{e^t - 1} - \frac{e^{-t}}{t} \Bigr) t^{z+k-1}\, dt. \]

This gives continuation to $\Re z > -k$, and, since $k$ is arbitrary, the continuation is to the entire plane. 

Problems for Chapter VI 


1. Prove, by elementary methods, that there are infinitely many 
primes not ending in the digit 1. 


2. Prove that there are infinitely many primes p for which neither 
p +2 nor p — 2 is prime. 


3. Prove that at least 1/6 of the integers are not expressible as the 
sum of 3 squares. 


4. Prove that $\Gamma(z)$ has no zeros in the whole plane, although it has poles. 


5. Suppose $\delta(x)$ decreases to 0 as $x \to \infty$. Produce an $\epsilon(x)$ which goes to 0 at $\infty$ but for which $\delta(x\epsilon(x)) = o(\epsilon(x))$. 


VII 


Simple Analytic Proof of the 
Prime Number Theorem 


The magnificent Prime Number Theorem has received much attention and many proofs throughout the past century. If we ignore the (beautiful) elementary proofs of Erdős and Selberg and focus on the analytic ones, we find that they all have some drawbacks. The original proofs of Hadamard and de la Vallée Poussin were based, to be sure, on the nonvanishing of $\zeta(z)$ in $\Re z \ge 1$, but they also required annoying estimates of $\zeta(z)$ at $\infty$, because the formulas for the coefficients of the Dirichlet series involve integrals over infinite contours (unlike the situation for power series) and so effective evaluation requires estimates at $\infty$. 

The more modern proofs, due to Wiener and Ikehara (and also Heins) get around the necessity of estimating at $\infty$ and are indeed based only on the appropriate nonvanishing of $\zeta(z)$, but they are tied to certain results of Fourier transforms. We propose to return to contour integral methods to avoid Fourier analysis and also to use finite contours to avoid estimates at $\infty$. Of course certain errors are introduced thereby, but the point is that these can be effectively minimized by elementary arguments. 

So let us begin with the well-known fact about the $\zeta$-function (see Chapter 6, pages 60-61):

\[ (z - 1)\zeta(z) \text{ is analytic and zero-free throughout } \Re z \ge 1. \tag{1} \]


This will be assumed throughout and will allow us to give our proof of the Prime Number Theorem. 

In fact we give two proofs. The first one is the shorter and simpler of the two, but we pay a price in that we obtain one of Landau's equivalent forms of the theorem rather than the standard form $\pi(N) \sim N/\log N$. Our second proof is a more direct assault on $\pi(N)$ but is somewhat more intricate than the first. Here we find some of Tchebychev's elementary ideas very useful. 

Basically our novelty consists in using a modified contour integral,

\[ \int_\Gamma f(z)\, N^z \Bigl( \frac{1}{z} + \frac{z}{R^2} \Bigr) dz, \]

rather than the classical one, $\int_C f(z) N^z z^{-1}\, dz$. The method is rather flexible, and we could use it to directly obtain $\pi(N)$ by choosing $f(z) = \log \zeta(z)$. We prefer, however, to derive both proofs from the following convergence theorem. Actually, this theorem dates back to Ingham, but his proof is à la Fourier analysis and is much more complicated than the contour integral method we now give. 


Theorem. Suppose $|a_n| \le 1$, and form the series $\sum a_n n^{-z}$ which clearly converges to an analytic function $F(z)$ for $\Re z > 1$. If, in fact, $F(z)$ is analytic throughout $\Re z \ge 1$, then $\sum a_n n^{-z}$ converges throughout $\Re z \ge 1$. 


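PROOF OF THE CONVERGENCE THEOREM. Fix a $w$ in $\Re w \ge 1$. Thus $F(z + w)$ is analytic in $\Re z \ge 0$. We choose an $R \ge 1$ and determine $\delta = \delta(R) > 0$, $\delta \le \frac{1}{2}$, and an $M = M(R)$ so that

\[ F(z + w) \text{ is analytic and bounded by } M \text{ in } -\delta \le \Re z,\ |z| \le R. \tag{2} \]

Now form the counterclockwise contour $\Gamma$ bounded by the arc $|z| = R$, $\Re z > -\delta$, and the segment $\Re z = -\delta$, $|z| \le R$. Also denote by $A$ and $B$, respectively, the parts of $\Gamma$ in the right and left half planes. 

By the residue theorem,

\[ 2\pi i\, F(w) = \int_\Gamma F(z + w)\, N^z \Bigl( \frac{1}{z} + \frac{z}{R^2} \Bigr) dz. \tag{3} \]

Now on $A$, $F(z + w)$ is equal to its series, and we split this into its partial sum $S_N(z + w)$ and remainder $r_N(z + w)$. Again by the residue theorem,

\[ \int_A S_N(z + w)\, N^z \Bigl( \frac{1}{z} + \frac{z}{R^2} \Bigr) dz = 2\pi i\, S_N(w) - \int_{-A} S_N(z + w)\, N^z \Bigl( \frac{1}{z} + \frac{z}{R^2} \Bigr) dz, \]

with $-A$ as usual denoting the reflection of $A$ through the origin. Thus, changing $z$ to $-z$, this can be written as

\[ \int_A S_N(z + w)\, N^z \Bigl( \frac{1}{z} + \frac{z}{R^2} \Bigr) dz = 2\pi i\, S_N(w) - \int_A S_N(w - z)\, N^{-z} \Bigl( \frac{1}{z} + \frac{z}{R^2} \Bigr) dz. \tag{4} \]

Combining (3) and (4) gives

\[ 2\pi i\, [F(w) - S_N(w)] = \int_A \Bigl[ r_N(z + w)\, N^z - \frac{S_N(w - z)}{N^z} \Bigr] \Bigl( \frac{1}{z} + \frac{z}{R^2} \Bigr) dz + \int_B F(z + w)\, N^z \Bigl( \frac{1}{z} + \frac{z}{R^2} \Bigr) dz, \tag{5} \]

and, to estimate these integrals, we record the following (here as usual we write $\Re z = x$, and we use the notation $\alpha \ll \beta$ to mean simply that $|\alpha| \le |\beta|$):

\[ \frac{1}{z} + \frac{z}{R^2} = \frac{2x}{R^2} \quad \text{along } |z| = R \text{ (in particular on } A\text{)}, \tag{6} \]

\[ \frac{1}{z} + \frac{z}{R^2} \ll \frac{1}{\delta} \Bigl( 1 + \frac{|z|^2}{R^2} \Bigr) \le \frac{2}{\delta} \quad \text{on the line } \Re z = -\delta,\ |z| \le R, \tag{7} \]

\[ r_N(z + w) \ll \sum_{n=N+1}^{\infty} \frac{1}{n^{x+1}} \le \int_N^\infty \frac{dn}{n^{x+1}} = \frac{1}{x N^x}, \tag{8} \]

and

\[ S_N(w - z) \ll \sum_{n=1}^{N} n^{x-1} \le N^{x-1} + \int_0^N n^{x-1}\, dn = N^x \Bigl( \frac{1}{N} + \frac{1}{x} \Bigr). \tag{9} \]

By (6), (8), (9), on $A$,

\[ \Bigl[ r_N(z + w)\, N^z - \frac{S_N(w - z)}{N^z} \Bigr] \Bigl( \frac{1}{z} + \frac{z}{R^2} \Bigr) \ll \Bigl( \frac{1}{x} + \frac{1}{x} + \frac{1}{N} \Bigr) \frac{2x}{R^2} \le \frac{4}{R^2} + \frac{2}{RN}, \]

and so, by the “maximum times length” estimate (M-L formula) for integrals, we obtain

\[ \int_A \Bigl[ r_N(z + w)\, N^z - \frac{S_N(w - z)}{N^z} \Bigr] \Bigl( \frac{1}{z} + \frac{z}{R^2} \Bigr) dz \ll \frac{4\pi}{R} + \frac{2\pi}{N}. \tag{10} \]

Next, by (2), (6), and (7), we obtain

\[ \int_B F(z + w)\, N^z \Bigl( \frac{1}{z} + \frac{z}{R^2} \Bigr) dz \ll \int_{-R}^{R} M \cdot N^{-\delta} \cdot \frac{2}{\delta}\, dy + 2M \int_{-\delta}^{0} N^x\, \frac{2|x|}{R^2} \cdot \frac{3}{2}\, dx \le \frac{4MR}{\delta N^{\delta}} + \frac{6M}{R^2 \log^2 N}. \tag{11} \]

Inserting the estimates (10) and (11) into (5) gives

\[ F(w) - S_N(w) \ll \frac{2}{R} + \frac{1}{N} + \frac{MR}{\delta N^{\delta}} + \frac{M}{R^2 \log^2 N}, \]

and, if we fix $R = 3/\epsilon$, we note that this right-hand side is $< \epsilon$ for all large $N$. We have verified the very definition of convergence! 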
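The whole proof hinges on the behaviour of the extra factor $\frac{1}{z} + \frac{z}{R^2}$, which is real and equal to $2x/R^2$ on the circle $|z| = R$ and of size at most $2/\delta$ on the segment $\Re z = -\delta$, $|z| \le R$. A minimal numerical check of these two facts follows; the function names are hypothetical.

```python
import cmath
import math

def kernel(z, R):
    """The extra factor 1/z + z/R^2 in the modified contour integral."""
    return 1 / z + z / R ** 2

def circle_deviation(R, samples=360):
    """Largest value of |kernel(z) - 2*Re(z)/R^2| over sample points of |z| = R."""
    worst = 0.0
    for t in range(samples):
        z = R * cmath.exp(2j * math.pi * t / samples)
        worst = max(worst, abs(kernel(z, R) - 2 * z.real / R ** 2))
    return worst
```

On the circle $1/z = \bar{z}/R^2$, so the sum is $(z + \bar{z})/R^2$; `circle_deviation` only measures rounding error.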


First Proof of the Prime Number Theorem. 


Following Landau, we will show that the convergence of $\sum_n \frac{\mu(n)}{n}$ (as given above) implies the PNT. Indeed all we need about this convergent series is the simple corollary that $\sum_{n \le N} \mu(n) = o(N)$. Expressing $\sum \frac{\mu(n)}{n^z}$ in terms of the $\zeta$-function, then, we have established the fact that $\frac{1}{\zeta(z)}$ has coefficients which go to 0 on average. 

The PNT is equivalent to the fact that the average of the coefficients of $-\frac{\zeta'}{\zeta}(z)$ is equal to 1. For simply note that

\[ -\frac{\zeta'}{\zeta}(z) = -\frac{d}{dz} \log \zeta(z) = -\frac{d}{dz} \log \prod_p \frac{1}{1 - p^{-z}} = \sum_p \frac{d}{dz} \log(1 - p^{-z}) = \sum_p \frac{p^{-z} \log p}{1 - p^{-z}} = \sum_p \frac{\log p}{p^z - 1}. \]

This last series is the same as $\sum \frac{\Lambda(n)}{n^z}$ where $\Lambda(n)$ is $\log p$ whenever $n$ is a power of $p$, $p$ any prime, and 0 otherwise. So indeed the average of these coefficients is $\frac{1}{N} \sum_{n \le N} \Lambda(n)$ whose limit being 1 is exactly the Prime Number Theorem. 
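The average $\frac{1}{N} \sum_{n \le N} \Lambda(n)$ is easy to compute for moderate $N$. The following sketch (function names hypothetical, a plain sieve assumed) tabulates $\Lambda(n)$ and forms the average; it illustrates the statement numerically and of course proves nothing.

```python
import math

def von_mangoldt_table(limit):
    """lam[n] = log p if n is a power of the prime p, else 0.0 (index 0 unused)."""
    lam = [0.0] * (limit + 1)
    is_composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if is_composite[p]:
            continue
        for multiple in range(2 * p, limit + 1, p):
            is_composite[multiple] = True
        q = p
        while q <= limit:                 # p, p^2, p^3, ... all get the value log p
            lam[q] = math.log(p)
            q *= p
    return lam

def average_of_lambda(N):
    """(1/N) * sum of Lambda(n) for n <= N; the PNT says this tends to 1."""
    return sum(von_mangoldt_table(N)[1:]) / N
```

A useful sanity check is that $\sum_{n \le N} \Lambda(n)$ equals the logarithm of the least common multiple of $1, \ldots, N$; for $N = 10$ this is $\log 2520$.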

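In short, we want the average value of the coefficients of $-\frac{\zeta'}{\zeta}(z) - \zeta(z)$ to approach 0. Writing this function as

\[ \frac{1}{\zeta(z)} \bigl[ -\zeta'(z) - \zeta^2(z) \bigr] = \sum \frac{\mu(n)}{n^z} \cdot \sum \frac{\log n - d(n)}{n^z}, \]

we may write this average (of the first $N$ terms) as

\[ \frac{1}{N} \sum_{ab \le N} \mu(a) [\log b - d(b)] = \frac{1}{N} \sum_{ab \le N} \mu(a) [\log b - d(b) + 2\gamma] - \frac{2\gamma}{N}, \]

where $2\gamma$ is chosen as the constant for which

\[ \sum_{b=1}^{K} [\log b - d(b) + 2\gamma] \]

becomes $O(\sqrt{K})$. 

Now we use the Landau corollary that $\sum_{n \le N} \mu(n) = o(N)$ to conclude that

\[ \frac{1}{N} \sum_{n \le N} \mu(n) \ll \delta(N), \]

where $\delta(N)$ tends to 0, and our trick is to pick a function $\omega(N)$ which approaches $\infty$ but such that $\omega(N)\, \delta\!\left(\frac{N}{\omega(N)}\right)$ approaches 0. 

This done, we may conclude that

\[ \sum_{ab \le N} \mu(a) [\log b - d(b) + 2\gamma] = O\!\left( \frac{N}{\sqrt{\omega(N)}} \right) + O\!\left( N\, \omega(N)\, \delta\!\left( \frac{N}{\omega(N)} \right) \right) = o(N), \]

and the proof is complete. 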
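Two ingredients of this argument lend themselves to a quick numerical look: the smallness of $\sum_{n \le N} \mu(n)$ relative to $N$, and the identity $\sum_{ab \le N} \mu(a) = 1$, which accounts for the term $2\gamma/N$. The sketch below is illustrative only; the function names are hypothetical and a simple sieve for $\mu$ is assumed.

```python
def mobius_table(limit):
    """mu[n] for 0 <= n <= limit (mu[0] is unused and left at 0)."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if is_composite[p]:
            continue
        for multiple in range(p, limit + 1, p):
            mu[multiple] = -mu[multiple]          # one more prime factor
            if multiple > p:
                is_composite[multiple] = True
        for multiple in range(p * p, limit + 1, p * p):
            mu[multiple] = 0                      # divisible by a square
    return mu

def mertens(N):
    """Sum of mu(n) for n <= N."""
    return sum(mobius_table(N)[1:])

def hyperbola_sum(N):
    """Sum of mu(a) over all pairs (a, b) of positive integers with a*b <= N."""
    mu = mobius_table(N)
    return sum(mu[a] * (N // a) for a in range(1, N + 1))
```

`hyperbola_sum(N)` returns 1 for every $N \ge 1$, since $\sum_{a \mid n} \mu(a)$ vanishes except at $n = 1$.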


Second Proof of the Prime Number Theorem. 


In this section, we begin with Tchebychev's observation that

\[ \sum_{p \le n} \frac{\log p}{p} - \log n \ \text{ is bounded}, \tag{12} \]

which he derived in a direct elementary way from the prime factorization of $n!$. 

The point is that the Prime Number Theorem is easily derived from

\[ \sum_{p \le n} \frac{\log p}{p} - \log n \ \text{ converges to a limit}, \tag{13} \]

by a simple summation by parts, which we leave to the reader. Nevertheless the transition from (12) to (13) is not a simple one, and we turn to this now. 
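The quantity in (12) and (13) can be tabulated directly. The sketch below is a numerical illustration with a hypothetical function name; the approximate limiting value $-1.33$ quoted in the usage note is taken from the classical literature on Mertens' theorems and is not derived in this text.

```python
import math

def tchebychev_sequence(limit):
    """values[n] = sum over primes p <= n of (log p)/p, minus log n (index 0 unused)."""
    is_composite = [False] * (limit + 1)
    values = [0.0] * (limit + 1)          # values[1] = 0 - log 1 = 0
    running = 0.0
    for n in range(2, limit + 1):
        if not is_composite[n]:           # n is prime
            running += math.log(n) / n
            for multiple in range(n * n, limit + 1, n):
                is_composite[multiple] = True
        values[n] = running - math.log(n)
    return values
```

For $n$ up to $10^4$ the values stay between $-2$ and $0$ and settle near $-1.33$, consistent with boundedness (12) and with convergence (13).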
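So, for $\Re z > 1$, form the function

\[ f(z) = \sum_{n=1}^{\infty} \frac{1}{n^z} \Bigl( \sum_{p \le n} \frac{\log p}{p} \Bigr) = \sum_p \frac{\log p}{p} \Bigl[ \sum_{n \ge p} \frac{1}{n^z} \Bigr]. \]

Now

\[ \sum_{n \ge p} \frac{1}{n^z} = \frac{1}{(z - 1)\, p^{z-1}} + z \int_p^\infty \frac{1 - \{t\}}{t^{z+1}}\, dt = \frac{1}{z - 1} \Bigl[ \frac{1}{p^{z-1}} + A_p(z) \Bigr], \]

where $A_p(z)$ is analytic for $\Re z > 0$ and is bounded by

\[ \frac{|z(z - 1)|}{\Re z \cdot p^{\Re z}}. \]

Hence,

\[ f(z) = \frac{1}{z - 1} \Bigl[ \sum_p \frac{\log p}{p^z - 1} + A(z) \Bigr], \]

where $A(z)$ is analytic for $\Re z > \frac{1}{2}$ by the Weierstrass M-test. By Euler's factorization formula, however, we recognize that

\[ \sum_p \frac{\log p}{p^z - 1} = -\frac{d}{dz} \log \zeta(z), \]

and so we deduce, by (1), that $f(z)$ is analytic in $\Re z \ge 1$ except for a double pole with principal part $1/(z - 1)^2 + c/(z - 1)$ at $z = 1$. Thus if we set

\[ F(z) = f(z) + \zeta'(z) - c\, \zeta(z) = \sum \frac{a_n}{n^z}, \]

where

\[ a_n = \sum_{p \le n} \frac{\log p}{p} - \log n - c, \tag{14} \]

we deduce that $F(z)$ is analytic in $\Re z \ge 1$. 

From (12) and our convergence theorem, then, we conclude that

\[ \sum \frac{a_n}{n} \ \text{ converges}, \]

and from this and the fact, from (14), that $a_n + \log n$ is nondecreasing, we proceed to prove $a_n \to 0$. 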
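By applying the Cauchy criterion we find that, for $N$ large,

\[ \sum_{N}^{N(1+\epsilon)} \frac{a_n}{n} \le \epsilon^2 \tag{15} \]

and

\[ \sum_{N(1-\epsilon)}^{N} \frac{a_n}{n} \ge -\epsilon^2. \tag{16} \]

In the range $N$ to $N(1 + \epsilon)$, by (14), $a_n \ge a_N + \log(N/n) \ge a_N - \epsilon$. So $\sum_{N}^{N(1+\epsilon)} a_n/n \ge (a_N - \epsilon) \sum_{N}^{N(1+\epsilon)} 1/n$, and (15) yields

\[ a_N \le \epsilon + \frac{\epsilon^2}{\sum_{N}^{N(1+\epsilon)} 1/n} \le \epsilon + \frac{\epsilon^2}{\frac{N\epsilon}{N(1 + \epsilon)}} = 2\epsilon + \epsilon^2. \tag{17} \]

Similarly in $[N(1 - \epsilon), N]$, $a_n \le a_N + \log(N/n) \le a_N + \epsilon/(1 - \epsilon)$, so that

\[ \sum_{N(1-\epsilon)}^{N} \frac{a_n}{n} \le \Bigl( a_N + \frac{\epsilon}{1 - \epsilon} \Bigr) \sum_{N(1-\epsilon)}^{N} \frac{1}{n}, \]

and (16) gives

\[ a_N \ge -\frac{\epsilon}{1 - \epsilon} - \frac{\epsilon^2}{\sum_{N(1-\epsilon)}^{N} 1/n} \ge -\frac{\epsilon}{1 - \epsilon} - \frac{\epsilon^2}{N\epsilon/N} = -\frac{\epsilon}{1 - \epsilon} - \epsilon. \tag{18} \]

Taken together, (17) and (18) establish that $a_N \to 0$, and so (13) is proved. 

Problems for Chapter VII 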


1. Given that $\sum \frac{a_n}{n}$ converges, prove that $\sum_{n \le N} a_n = o(N)$. 


2. Given that $\sum \frac{a_n}{n}$ converges and that $a_n - a_{n-1} > -\frac{1}{n}$, prove that $a_n \to 0$. 


3. Show that $d(n)$, the number of divisors of $n$, is $O(n^{\epsilon})$ for every positive $\epsilon$. 


4. In fact, show that $d(n) \ll n^{1/\log\log n}$. 